Reductions in Distributed Computing 



Part I: Consensus and Atomic Commitment Tasks

Abstract



We introduce several notions of reduction in distributed computing, and investigate
reduction properties of two fundamental agreement tasks, namely Consensus and
Atomic Commitment.

We first propose the notion of reduction "à la Karp", an analog for distributed
computing of the classical Karp reduction. We then define a weaker reduction which
is the analog of Cook reduction. These two reductions are called K-reduction and
C-reduction, respectively. We also introduce the notion of C*-reduction which has no
counterpart in classical (namely, non distributed) systems, and which naturally arises
when dealing with symmetric tasks.

We establish various reducibility and irreducibility theorems with respect to these
three reductions. Our main result is an incomparability statement for Consensus and
Atomic Commitment tasks: we show that they are incomparable with respect to the
C-reduction, except when the resiliency degree is 1, in which case Atomic Commitment
is strictly harder than Consensus. A side consequence of these results is that our notion
of C-reduction is strictly weaker than the one of K-reduction, even for unsolvable tasks.

1 Introduction

The purpose of this paper is to develop a formalism for addressing the problems of reduction
in distributed computing, and to investigate reduction properties of various agreement
problems, namely, Consensus and Atomic Commitment problems (in Part I), and their
generalizations defined by the so-called k-Threshold Agreement problems [?] (in Part II).

The notion of reduction plays a key role in the theory of computability. A reduction 
from some problem A to another one B is a way of converting A to B in such a way that 
a method for solving B yields a method for solving A. The existence of such a reduction 
establishes - by definition - that B is at least as hard to solve as A, or in other terms, that 
the degree of unsolvability¹ of B is not less than the one of A.

Several notions of reducibility - hence of degrees of unsolvability - have been formally
defined and investigated in various frameworks. Let us only mention the various kinds
of effective reducibilities used in recursive function theory (see for instance [24]), and
computation bounded reducibilities which play a key role in the theory of computational
complexity, notably since the introduction of polynomial-time bounded reducibilities by
Cook and Karp [?] (see also [20] for a discussion and references on polynomial-time
reducibilities).

Laboratoire LIX, Ecole Polytechnique, 91128 Palaiseau Cedex, France

¹ Given some reduction relation, it might seem natural to call an equivalence class - with respect to
the reduction - which involves solvable problems a "degree of solvability"; it is customary, however, to speak
without exception of "degree of unsolvability".

Concerning distributed computations, many reducibility results have been established
and used to show that some problems are not solvable (see for instance [15], [13], [23], [?]).
Observe however that most of them are derived using an informal notion of reduction - a
significant exception being the work by Dwork and Skeen [14] on patterns of communication.²
This lack of formal foundations for the reducibility notion in distributed computing
is not a serious issue in the proofs of the above reducibility results. Indeed, they are established
by means of constructive arguments which should be easily formalized in any sensible
rigorous model for distributed computations and reducibility. By contrast, a formalized
approach to reduction in distributed computing is necessary for sound derivations of
irreducibility results.

In the first sections of this paper, we develop such a formal approach. As we are 
mainly interested in comparing the hardness of unsolvable tasks - this is similar to the 
study of polynomial-time reducibilities of problems which are not supposed to be solvable in 
polynomial time - we need to refer, in the definition of reducibility, to some deus ex machina 
for solving (algorithmically unsolvable) tasks. In other words, we grant the processes of a 
distributed system the ability to query a "black box" which supplies a correct answer, 
magically solving some specific distributed coordination task. Such a black box dedicated
to solving some task is called, as in complexity theory, an oracle. Our model for oracles takes
into account two specific features of distributed computing: (i) an oracle has to synchronize 
and coordinate the queries from the different processes in the system; (ii) in the context 
of systems where processes may exhibit failures, an oracle ought to answer even if some of 
the processes (the maximum number of which is the resiliency degree of the oracle) do not 
query it. 

Relying on our formal definition of oracle, we may introduce algorithms using oracles,
and define various notions of reduction between agreement tasks based on the latter.
Namely, we introduce analogs for distributed systems of the many-one and Turing reductions
in recursive function theory. Since their polynomial-time bounded versions in computational
complexity are the well-known Karp and Cook reductions, we call them K-reduction
and C-reduction.³ We also introduce notions of reducibility which have no counterpart in
classical (namely, non distributed) systems: the C*-reduction which arises naturally when
dealing with symmetric tasks, and in Part II, the Failure-Information reduction, designed
for the study of failure resilient tasks.

Using this formalism, we may derive rigorous reducibility and irreducibility theorems. 
Our main result is an incomparability statement for Consensus and Atomic Commitment 
tasks: we show that they are incomparable with respect to the C-reduction, except when the 
resiliency degree is 1, in which case Atomic Commitment is strictly harder than Consensus. 
A side consequence of these results on the comparison between Consensus and Atomic Commitment
is that the notion of C-reduction is strictly weaker than the one of K-reduction,
even for unsolvable tasks. As shown in [?], a similar situation arises for polynomial-time
bounded reducibilities between problems in P. Note however that comparing Karp and
Cook reductions on NP is still an open problem.

Part II of this paper will consider the class of k-Threshold Agreement tasks introduced
in [?], which encompasses both Consensus and Atomic Commitment tasks. We generalize
the reducibility and irreducibility results established in Part I to these new agreement tasks.
From these extensions, we derive new irreducibility results between Consensus tasks when
varying the set of processes and the resiliency degree.

² The notion of reducibility introduced in [14] is however much more restrictive than the one intuitively
used in the papers cited above: two problems P1 and P2 are equivalent with respect to the reducibility
relation in [14] iff the sets of algorithms solving P1 and P2 essentially coincide, up to relabeling local states
and padding messages.

³ In the context of distributed systems, the terminology "many-one" and "Turing" reducibilities would be
especially misleading.

Part I is organized as follows. In Section 2, following the general definition of decision
tasks given in [23], we define agreement tasks, and then present the notion of symmetric
agreement tasks. In Section 3, we introduce our formal definition of oracle; technically,
it is convenient to distinguish between an oracle and its name, which we call its sanctuary
and which we need to introduce before the corresponding oracle. We also explain the
correspondence between our oracles and agreement tasks. In Section 4, we follow the
computational model developed in [?] to describe a computational model for message-passing
systems in which processes may consult oracles. Our model basically differs from the one
in [?] by the fact that the computation unit called "step" is no longer atomic: when taking
a step in which it consults some oracle, a process may be blocked after querying the oracle
if the latter does not answer. Section 5 defines the K-reduction, and establishes
K-reducibility and irreducibility results. In Section 6, we present the C-reduction, and its
symmetrized version, called C*-reduction. In Section 7, we examine Consensus and Atomic
Commitment tasks, and their reducibility relations when varying the number of processes
in the system. Our main results appear in Section 8 in which we prove that Consensus and
Atomic Commitment tasks are generally incomparable.

2 Failure patterns and agreement tasks 

Our model of computation consists of a collection $\Pi$ of $n$ asynchronous processes, which
communicate by exchanging messages. Communications are point-to-point. Every pair of
processes is connected by a reliable channel. We assume the existence of a discrete global
clock to which processes do not have access. The range of the clock's ticks is the set of
natural numbers, and is denoted by $\mathcal{T}$.

2.1 Failures and failure patterns 

Processes may fail by crashing. A failure pattern $F$ for $\Pi$ is a function $F : \mathcal{T} \to 2^{\Pi}$ such
that

$$\forall t \in \mathcal{T},\quad F(t) \subseteq F(t + 1). \tag{1}$$

For any $t \in \mathcal{T}$, $F(t)$ represents the set of processes that have crashed by time $t$. If $p \notin F(t)$,
we say that $p$ is alive at time $t$, and condition (1) means that processes are assumed not to
recover.

Process $p$ is faulty (with respect to $F$) if $p \in \mathrm{Faulty}(F) = \bigcup_{t \in \mathcal{T}} F(t)$; otherwise, $p$ is
correct and $p \in \mathrm{Correct}(F) = \Pi \setminus \mathrm{Faulty}(F)$.

We only consider failure patterns with at least one survivor, that is the failure patterns
$F$ such that $|\mathrm{Faulty}(F)| < |\Pi|$. The set of these failure patterns for $\Pi$ is denoted by $\mathcal{F}_\Pi$.
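
To make these definitions concrete, here is a minimal Python sketch of crash failure patterns. The representation is an assumption made for illustration only: a pattern is built from a hypothetical map of crash times, and the infinite union defining Faulty(F) is approximated by a union up to a finite `horizon`.

```python
from typing import Callable, Dict, FrozenSet, Iterable

Process = str
# A failure pattern maps a time t to the set of processes crashed by time t.
FailurePattern = Callable[[int], FrozenSet[Process]]


def pattern_from_crash_times(crash_time: Dict[Process, int]) -> FailurePattern:
    """Build F from {process: crash time}; unlisted processes never crash."""
    def F(t: int) -> FrozenSet[Process]:
        return frozenset(p for p, tc in crash_time.items() if tc <= t)
    return F


def is_monotone(F: FailurePattern, horizon: int) -> bool:
    """Check condition (1), F(t) included in F(t + 1), for every t < horizon."""
    return all(F(t) <= F(t + 1) for t in range(horizon))


def faulty(F: FailurePattern, horizon: int) -> FrozenSet[Process]:
    """Union of F(t) for t <= horizon, a finite stand-in for Faulty(F)."""
    return frozenset().union(*(F(t) for t in range(horizon + 1)))


def correct(processes: Iterable[Process], F: FailurePattern,
            horizon: int) -> FrozenSet[Process]:
    """Correct(F) = Pi minus Faulty(F), up to the chosen horizon."""
    return frozenset(processes) - faulty(F, horizon)


def has_survivor(processes: Iterable[Process], F: FailurePattern,
                 horizon: int) -> bool:
    """At least one survivor: |Faulty(F)| < |Pi|."""
    return len(faulty(F, horizon)) < len(frozenset(processes))
```

Patterns built from crash times are monotone by construction, so `is_monotone` is mainly useful for patterns supplied in some other way.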

2.2 Agreement problems and agreement tasks 

We view an agreement problem as a mapping of possible inputs and failure patterns to sets
of allowable decision values. Formally, let $\mathcal{V}$ be a set of input and output values, and $\Pi$ be
a set of process names. An agreement problem $P$ for $\Pi$ and $\mathcal{V}$ is given by a subset $\mathcal{V}_P$ of $\mathcal{V}^{\Pi}$
and a mapping

$$P : \mathcal{F}_\Pi \times \mathcal{V}_P \to 2^{\mathcal{V}} \setminus \{\emptyset\}.$$

Each element $V \in \mathcal{V}_P$ represents a possible initial assignment of input values in $\mathcal{V}$ to the
processes $p \in \Pi$ and is called an input vector of problem $P$. For any $(F, V) \in \mathcal{F}_\Pi \times \mathcal{V}_P$,
the non-empty subset $P(F, V)$ of $\mathcal{V}$ represents the set of allowable decision values with the
input vector $V$ and the failure pattern $F$.

For any $v$ in $\mathcal{V}$, the constant mapping $V$ defined by $V(p) = v$, for every $p \in \Pi$, is denoted
by $\mathbf{v}$ (to simplify notation, we omit reference to $\Pi$).

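The simplest agreement problem for $\Pi$ and $\mathcal{V}$ is Consensus, denoted $\mathrm{Cons}_{\mathcal{V},\Pi}$. Its
only requirement is that the decision value must be some process input value. Formally,
$\mathcal{V}_{\mathrm{Cons}_{\mathcal{V},\Pi}} = \mathcal{V}^{\Pi}$, and for each pair $(F, V) \in \mathcal{F}_\Pi \times \mathcal{V}^{\Pi}$, the set $\mathrm{Cons}_{\mathcal{V},\Pi}(F, V)$ of allowable
decision values is defined as the set of elements of $\mathcal{V}$ that occur in the input vector $V$.

In the case of the binary consensus problem for $\Pi$, simply denoted $\mathrm{Cons}_\Pi$, we have
$\mathcal{V} = \{0, 1\}$, and the function $\mathrm{Cons}_\Pi$ is defined by:

• $\forall F \in \mathcal{F}_\Pi,\ \forall V \in \{0, 1\}^{\Pi} \setminus \{\mathbf{0}, \mathbf{1}\}$ : $\mathrm{Cons}_\Pi(F, V) = \{0, 1\}$;
• $\forall F \in \mathcal{F}_\Pi$ : $\mathrm{Cons}_\Pi(F, \mathbf{0}) = \{0\}$ and $\mathrm{Cons}_\Pi(F, \mathbf{1}) = \{1\}$.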

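Another well-known agreement problem for $\Pi$ is Atomic Commitment, denoted $\mathrm{AC}_\Pi$. It
may be described as follows in terms of the previous definitions: $\mathcal{V} = \{0, 1\}$, $\mathcal{V}_{\mathrm{AC}_\Pi} = \{0, 1\}^{\Pi}$,
and for any $(F, V) \in \mathcal{F}_\Pi \times \{0, 1\}^{\Pi}$,

• $\mathrm{AC}_\Pi(F, V) = \{0\}$ if $V \neq \mathbf{1}$,

• $\mathrm{AC}_\Pi(F, V) = \{1\}$ if $V = \mathbf{1}$ and $\mathrm{Faulty}(F) = \emptyset$,

• $\mathrm{AC}_\Pi(F, V) = \{0, 1\}$ if $V = \mathbf{1}$ and $\mathrm{Faulty}(F) \neq \emptyset$.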

Classically, the input values are denoted by No and Yes, and processes may decide on Abort 
or Commit. In the previous definition, we have identified Yes and Commit with 1, and No 
and Abort with 0. 
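
The two problems can be written down directly as functions. The sketch below is illustrative only and rests on two representation choices that are ours: an input vector is a Python dictionary from process names to values, and a failure pattern is summarized by its set Faulty(F), which is all that either definition depends on.

```python
def cons(faulty, V):
    """Cons(F, V): any value occurring in the input vector V is allowed."""
    return set(V.values())


def atomic_commitment(faulty, V):
    """AC(F, V), with Yes/Commit identified with 1 and No/Abort with 0."""
    if any(v != 1 for v in V.values()):
        return {0}            # some process voted No: Abort is forced
    if not faulty:
        return {1}            # all voted Yes and nobody crashed: Commit is forced
    return {0, 1}             # all voted Yes but some process crashed: either outcome
```

For instance, with three processes all voting 1, `atomic_commitment(set(), V)` allows only Commit, whereas a single crash makes both outcomes allowable.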

For any set $\Pi$ of $n$ process names, the data of an agreement problem $P$ for $\Pi$ and of
an integer $f$ such that $0 \le f \le n - 1$ define an agreement task. The integer $f$ is called the
resiliency degree of the task. The tasks with the maximum resiliency degree are classically
called wait-free tasks.

The distributed task defined by the Atomic Commitment problem for $\Pi$ and the resiliency
degree $f$ will be denoted $\mathrm{AC}(\Pi, f)$. Similarly, we shall denote by $\mathrm{Cons}(\Pi, f)$ the task
defined by the Consensus problem and the resiliency degree $f$.



2.3 Renaming and symmetry 

Let $\Pi$ and $\Pi'$ be two sets of $n$ process names, and let $\Phi : \Pi \to \Pi'$ be a one-to-one mapping.
Such a map may be seen as a renaming of the processes in $\Pi$, and may be used to translate
any input vector (or failure pattern, or agreement problem, or distributed task, ...) $X$ on
the set of processes $\Pi$ to one ${}^{\Phi}X$ on the set of processes $\Pi'$. These transformations under
renaming are bijective and satisfy the following composition property: if $\Pi''$ denotes a third
set of $n$ processes, and $\Phi' : \Pi' \to \Pi''$ is a one-to-one mapping, then

$${}^{\Phi'}\bigl({}^{\Phi}X\bigr) = {}^{\Phi' \circ \Phi}X. \tag{2}$$

Formally, these transformations are defined as follows:

• given any vector $V$ in $\mathcal{V}^{\Pi}$, the vector ${}^{\Phi}V$ is the vector $V \circ \Phi^{-1}$ in $\mathcal{V}^{\Pi'}$;

• for any failure pattern $F$ for $\Pi$, we let ${}^{\Phi}F : t \in \mathcal{T} \mapsto \Phi(F(t))$;

• for any agreement problem $P$ for $\Pi$, ${}^{\Phi}P$ is the agreement problem for $\Pi'$ defined by:

$\mathcal{V}_{{}^{\Phi}P} = \{{}^{\Phi}V : V \in \mathcal{V}_P\}$ and ${}^{\Phi}P({}^{\Phi}F, {}^{\Phi}V) = P(F, V)$;

• finally, for any agreement task $T = (P, f)$ for $\Pi$, we let ${}^{\Phi}T = ({}^{\Phi}P, f)$.

In the sequel, we denote by $\mathrm{Cons}(n, f)$ (resp. $\mathrm{AC}(n, f)$) the $f$-resilient task defined
by the Consensus (resp. the Atomic Commitment) problem for the set of process names
$\Pi = \{1, \dots, n\}$, that is:

$$\mathrm{Cons}(n, f) = \mathrm{Cons}(\{1, \dots, n\}, f)$$

and

$$\mathrm{AC}(n, f) = \mathrm{AC}(\{1, \dots, n\}, f).$$

Clearly, for any set $\Pi$ of $n$ processes and for any renaming $\Phi : \{1, \dots, n\} \to \Pi$, we have:

$${}^{\Phi}\mathrm{Cons}(n, f) = \mathrm{Cons}(\Pi, f) \tag{3}$$

and

$${}^{\Phi}\mathrm{AC}(n, f) = \mathrm{AC}(\Pi, f). \tag{4}$$

Using transformations under renaming, we may formally define the symmetry of an
agreement problem $P$ or of a distributed task $T$ on some given set of processes $\Pi$. Namely,
$P$ (resp. $T$) is symmetric when, for any permutation $\sigma$ of $\Pi$, we have ${}^{\sigma}P = P$ (resp.,
${}^{\sigma}T = T$). Clearly, $T = (P, f)$ is symmetric iff $P$ is.

In more explicit terms, the symmetry of $P$ means that, for any permutation $\sigma$ of $\Pi$, $\mathcal{V}_P$
is invariant under the permutation $V \mapsto {}^{\sigma}V$ of $\mathcal{V}^{\Pi}$, and for any failure pattern $F$ for $\Pi$ and any
input vector $V$ in $\mathcal{V}_P$, we have $P({}^{\sigma}F, {}^{\sigma}V) = P(F, V)$.

As a straightforward consequence of the composition property (2), the symmetry property
is invariant under renaming: if $P$ (resp. $T$) is symmetric, then for any renaming
$\Phi : \Pi \to \Pi'$, ${}^{\Phi}P$ (resp. ${}^{\Phi}T$) also is symmetric. Observe finally that (3) and (4) applied
to permutations $\Phi$ of $\{1, \dots, n\}$ show that $\mathrm{Cons}(n, f)$ and $\mathrm{AC}(n, f)$ are symmetric. By
invariance of symmetry under renaming, this is equivalent to the symmetry of $\mathrm{Cons}(\Pi, f)$ and
$\mathrm{AC}(\Pi, f)$ for any set $\Pi$ of processes.
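
The explicit form of symmetry lends itself to a brute-force check on small systems. The following sketch is only an illustration under simplifying assumptions of ours: failure patterns are summarized by their faulty sets (which suffices for Consensus and Atomic Commitment), the input domain is the full set of binary vectors, and the helper names are hypothetical.

```python
from itertools import combinations, permutations, product


def cons(faulty, V):
    return set(V.values())


def atomic_commitment(faulty, V):
    if any(v != 1 for v in V.values()):
        return {0}
    return {1} if not faulty else {0, 1}


def rename_vector(phi, V):
    """Renamed vector V o phi^{-1}: the input of p becomes the input of phi(p)."""
    return {phi[p]: v for p, v in V.items()}


def rename_faulty(phi, faulty):
    """Image of the faulty set under the renaming phi."""
    return frozenset(phi[p] for p in faulty)


def is_symmetric(problem, processes, values=(0, 1)):
    """Check P(sigma F, sigma V) = P(F, V) for every permutation sigma."""
    procs = sorted(processes)
    # Faulty sets of size at most n - 1, i.e. with at least one survivor.
    faulty_sets = [frozenset(c) for k in range(len(procs))
                   for c in combinations(procs, k)]
    vectors = [dict(zip(procs, vals))
               for vals in product(values, repeat=len(procs))]
    for image in permutations(procs):
        sigma = dict(zip(procs, image))
        for faulty in faulty_sets:
            for V in vectors:
                renamed = problem(rename_faulty(sigma, faulty),
                                  rename_vector(sigma, V))
                if renamed != problem(faulty, V):
                    return False
    return True
```

On three processes, both `cons` and `atomic_commitment` pass this check, while a problem that singles out one process (for example, "decide the input of p1") does not.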

3 Sanctuaries, oracles, consultations 

3.1 Sanctuaries, consultations, and histories 

Informally, a distributed oracle for an agreement problem P is a black box that can be 
queried by processes with some input values for P, and that is capable of reporting a 
solution to P provided it has received sufficiently many queries. Each oracle is identified 
by its name, which we call the oracle's sanctuary.

Formally, we fix a set of values $\mathcal{V}$, a set of process names $\Pi$, and a finite set $\Sigma$ of
sanctuaries. Let $\Gamma : \Sigma \to 2^{\Pi} \setminus \{\emptyset\}$ be a function which assigns to each sanctuary $\sigma \in \Sigma$ a
subset $\Gamma(\sigma)$ of $\Pi$ which represents the set of processes allowed to consult $\sigma$. The elements
of $\Gamma(\sigma)$ will be called the consultants of $\sigma$.

An event at the sanctuary $\sigma \in \Sigma$ is defined as a tuple $e = (\sigma, p, t, \tau, v)$, where $p \in \Gamma(\sigma)$
is the process name of $e$, $t \in \mathcal{T}$ is the time of $e$, $\tau \in \{Q, A\}$ is the type of $e$ (query or answer), and $v \in \mathcal{V}$ is
the argument value of $e$.

Let $\sigma \in \Sigma$ be any sanctuary; a history $H$ of $\sigma$ is a (finite or infinite) sequence of events
at $\sigma$ such that the times of events in $H$ form a non-decreasing list. For any consultant $p$ of
$\sigma$, the subsequence of all events in $H$ whose process names are $p$ will be denoted by $H|p$.
For any positive integer $k$, the $k$-th consultation in $H$, denoted by $H_k$, is defined as the
subsequence of all the events $e$ in $H$ such that for some $p \in \Gamma(\sigma)$, $e$ is the $k$-th query or
the $k$-th answer event in $H|p$. A history $H$ of the sanctuary $\sigma$ is well-formed if (i) for each
process $p \in \Gamma(\sigma)$, the first event in $H|p$, when $H|p$ is not empty, is a query event, (ii) each
query event - except possibly the last one - is immediately followed by an answer event,
and (iii) each answer event - except possibly the last one - is immediately followed by a
query event.

Let $F$ be a failure pattern for $\Gamma(\sigma)$. A history $H$ of the sanctuary $\sigma$ is compatible with $F$
if any process that has crashed by some time does not consult $\sigma$ anymore; in other words,
for any $(p, t) \in \Pi \times \mathcal{T}$ such that $p \in F(t)$, no event of the form $(\sigma, p, t', -, -)$ with $t' > t$
occurs in $H|p$.⁴
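
The notions of consultation, well-formedness, and compatibility can be sketched on finite histories as follows. This is an illustration rather than part of the formal model: events are plain tuples `(sigma, p, t, kind, value)` with `kind` equal to `"Q"` or `"A"`, and we assume that conditions (ii) and (iii) are read within each subsequence H|p, so that queries and answers of different processes may interleave.

```python
from collections import defaultdict


def restrict(H, p):
    """H|p: the subsequence of events of H whose process name is p."""
    return [e for e in H if e[1] == p]


def consultation(H, k):
    """H_k: events that are the k-th query or k-th answer of their process."""
    seen = defaultdict(int)          # (process, kind) -> events seen so far
    out = []
    for e in H:
        _sigma, p, _t, kind, _value = e
        seen[(p, kind)] += 1
        if seen[(p, kind)] == k:
            out.append(e)
    return out


def is_well_formed(H):
    """Times are non-decreasing and each H|p alternates Q, A, Q, A, ..."""
    times = [e[2] for e in H]
    if times != sorted(times):
        return False
    for p in {e[1] for e in H}:
        kinds = [e[3] for e in restrict(H, p)]
        expected = ["Q", "A"] * (len(kinds) // 2 + 1)
        if kinds != expected[:len(kinds)]:
            return False
    return True


def is_compatible(H, F):
    """No event of p at a time t' > t for any t with p in F(t)."""
    return all(p not in F(t)
               for (_sigma, p, t_event, _kind, _value) in H
               for t in range(t_event))
```

The compatibility test follows the strict inequality t' > t as printed above; a stricter reading would only change the range of `t`.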

3.2 Distributed oracles 

For each sanctuary $\sigma \in \Sigma$, let $O_\sigma$ be a function which maps each failure pattern $F$ for $\Gamma(\sigma)$ to
a set of well-formed histories of the sanctuary $\sigma$ which are compatible with $F$. The function
$O_\sigma$ will be called the oracle of sanctuary $\sigma$. Moreover, if $P$ is an agreement problem for
$\Gamma(\sigma)$, we shall say that $O_\sigma$ is an oracle suitable for $P$ if for any failure pattern $F$ for $\Gamma(\sigma)$,
for any history $H \in O_\sigma(F)$, and for any positive integer $k$, the $k$-th consultation in $H$
satisfies the following two conditions:

Agreement. The oracle answers the same value to all processes. Formally:

$$(\sigma, -, -, A, d) \in H_k \wedge (\sigma, -, -, A, d') \in H_k \Rightarrow d = d'.$$

P-Validity. If the oracle answers a value to some process, then this value is allowed by
$P$. Formally, if $W$ denotes the partial input vector defined by $H_k$ (namely $W(p) = v$ iff
$(\sigma, p, -, Q, v) \in H_k$), and $V$ any extension of $W$ in $\mathcal{V}_P$, any value $d$ answered by $O_\sigma$ in $H_k$
belongs to $P(F, V)$.

Finally, we shall say that the oracle $O_\sigma$ is f-resilient if for any failure pattern $F$ for
$\Gamma(\sigma)$, for any history $H \in O_\sigma(F)$, and for any consultation of $O_\sigma$ in $H$ with at least $n - f$
query events, every correct process finally gets an answer from $O_\sigma$. Formally, $O_\sigma$ is defined
to be $f$-resilient if it satisfies:

f-Resilience. For any failure pattern $F$ for $\Gamma(\sigma)$, for any history $H \in O_\sigma(F)$, and for any
integer $k \ge 1$, the $k$-th consultation in $H$ satisfies:

$$\forall p \in \mathrm{Correct}(F) : |\{e \in H_k : e = (\sigma, -, -, Q, -)\}| \ge |\Gamma(\sigma)| - f \Rightarrow (\sigma, p, -, A, -) \in H_k.$$
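
The three conditions can be checked mechanically on a single finite consultation. The sketch below is a hedged illustration: it assumes that the input domain is the full set of vectors over `values`, that the consultation is complete (no answer is still to come), and that the failure pattern is summarized by its faulty set. Atomic Commitment is used as the example problem.

```python
from itertools import product


def atomic_commitment(faulty, V):
    if any(v != 1 for v in V.values()):
        return {0}
    return {1} if not faulty else {0, 1}


def check_consultation(Hk, problem, faulty, consultants, values, f):
    """Return (agreement, validity, resilience) for one consultation H_k."""
    queries = {p: v for (_s, p, _t, kind, v) in Hk if kind == "Q"}
    answers = {p: d for (_s, p, _t, kind, d) in Hk if kind == "A"}

    # Agreement: all answers carry the same value.
    agreement = len(set(answers.values())) <= 1

    # P-validity: each answer is allowed for every extension V of the
    # partial input vector W defined by the queries.
    missing = sorted(set(consultants) - set(queries))
    extensions = [{**queries, **dict(zip(missing, fill))}
                  for fill in product(values, repeat=len(missing))]
    validity = all(d in problem(faulty, V)
                   for d in answers.values() for V in extensions)

    # f-resilience: with at least |Gamma(sigma)| - f queries, every
    # correct consultant gets an answer.
    enough_queries = len(queries) >= len(set(consultants)) - f
    correct = set(consultants) - set(faulty)
    resilience = (not enough_queries) or all(p in answers for p in correct)
    return agreement, validity, resilience
```

With three consultants of which one never queries, answering 1 violates validity for Atomic Commitment, since the silent process might have held input 0.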

Observe that our oracles are suitable only for agreement problems. However, it is
straightforward to extend their definition to oracles suitable for decision problems [1].⁵

⁴ Throughout this paper, a "-" in a tuple denotes an arbitrary value of the appropriate type.

⁵ In a decision problem, the sets of allowable decision values do not depend on failure patterns, and
processes may decide differently. Renaming and k-Set Agreement are two well-known decision
problems in which agreement is not required.






3.3 The oracle for an agreement task 

Let $\Pi$ be a set of $n$ process names, and let $T$ be the task defined by some agreement problem
$P$ and some integer $f$, $0 \le f \le n - 1$. To these data, we may naturally attach some
$f$-resilient oracle suitable for $P$, in the following way. Its sanctuary - which, by definition, is
a mere identifier - will be $T$ itself, and it will be suggestive and typographically convenient
to write $O.T$ instead of $O_T$. The set of consultants $\Gamma(\sigma)$ of the oracle $O.T$ will be $\Pi$ itself,
and for any failure pattern $F$ for $\Pi$, we shall define $O.T(F)$ as the set of all well-formed
histories $H$ (of the sanctuary $T$) which are compatible with $F$, and satisfy the agreement,
$P$-validity, and $f$-resilience conditions.

Clearly, $O.T$ is the "most general" $f$-resilient oracle for $P$, in the sense that for any
$f$-resilient oracle $O$ for $P$, and for any failure pattern $F$, we have $O(F) \subseteq O.T(F)$.

The following properties of the oracles for Consensus and Atomic Commitment tasks will
be useful in the sequel. They are straightforward consequences of the Cons- and AC-validity
conditions (cf. Sections 2.2 and 3.2).

$O_{\mathrm{Cons}}$ In any consultation of an oracle suitable for Consensus, if all the queries have the
same value $v$, then the only possible answer of the oracle is $v$.

$O_{\mathrm{AC}}$ In any consultation of an oracle suitable for Atomic Commitment, the oracle is allowed
to answer 1 only if all processes query the oracle, and all the query values are 1.

3.4 Related notions 

The notion of oracle already appears at various places in the literature on distributed
computing. Indeed, it has been used in an informal way first for randomization (see
also [10] where it occurs under the name of coin), and then for failure detectors [?]. In both
cases, an oracle is supposed to answer upon any query by some process. Such an oracle has
a maximal resiliency degree (namely $n - 1$, if $n$ is the number of processes which may query
the oracle).

This is not the only point in which random and failure detector oracles differ from
ours. In the case of failure detectors or randomization with private coins - as in Ben-Or's
algorithm [2] - the oracle is totally distributed and does not coordinate the various queries
from processes. For this type of oracle, there is no notion of consultation. Observe however
that randomization with a global coin - as in Bracha's algorithm - underlies a notion of
oracle which is closer to ours since all the processes see the same outcome.

Interestingly, the fundamental concept of shared object introduced by Herlihy has
some common flavor with our oracles. Indeed, an object of type consensus [?] in a system
$\Pi$ with $n$ processes coincides with our oracle $O.\mathrm{Cons}(\Pi, n - 1)$. The generalization of the
notion of shared object proposed by Malki et al. in [22] turns out to be yet closer: our
$f$-resilient oracles for $\Pi$ actually correspond to $f$-resilient shared objects of [22] with an
access list $\Pi$ and only one operation.

4 Algorithms using oracles 

In this section, we fix $\mathcal{V}$, $\Sigma$, $\Pi$, two non-empty subsets $\Pi_1$ and $\Pi_2$ of $\Pi$, $\Gamma : \Sigma \to 2^{\Pi_2} \setminus \{\emptyset\}$,
and a family $(O_\sigma)_{\sigma \in \Sigma}$ of oracles as defined in Section 3.

4.1 Steps, events, and local histories

We model the communication channels as a message buffer, denoted $\beta$, that represents the
multiset of messages that have been sent but not yet delivered. A message is defined as
a pair $(p, m)$, where $p$ is the name of the destination process, and $m$ is a message value
from a fixed universe $M$.

An algorithm for $\Pi_1$ using the oracles of the sanctuaries in $\Sigma$ is a function $A$ that maps
each process name $p \in \Pi_1$ to a deterministic automaton $A(p)$. The computation locally
proceeds in steps. Each step of $A(p)$ consists of a series of different phases:

Message Receipt. Process $p$ receives a single message of the form $(p, m)$ from $\beta$.

Oracle Query. Process $p$ queries a single oracle $O_\sigma$, $p \in \Gamma(\sigma)$, with some value $v \in \mathcal{V}$.

Oracle Answer. Process $p$ gets an answer $d \in \mathcal{V}$ from the oracle that $p$ consults in this step.

State Change. Process $p$ changes its local state, and sends a message to a single process or
sends no message, according to the automaton $A(p)$. These actions are based on $p$'s state
at the beginning of the step, the possible message received in the step, and the possible
value answered by the oracle.

In every step, $p$ may skip the two intermediate phases ($p$ consults no oracle); it may also
skip the first phase ($p$ receives no message). So there are four kinds of steps with one, two,
three, or four phases, depending on whether $p$ receives a message and whether it consults
an oracle.

The message actually received by $p$ in the Message Receipt phase is chosen nondeterministically
amongst the messages in $\beta$ that are addressed to $p$. Process $p$ may receive no
message even if $\beta$ contains messages that are addressed to $p$. Indeed, we model asynchronous
systems, where messages may experience arbitrary (but finite) delays.

Besides, the fact that $p$ is allowed or not to consult an oracle in some step is totally
determined by the local state of $p$ at the beginning of the step. Moreover, in the case of a
local state in which $p$ consults an oracle, the name of the oracle (i.e., the sanctuary) is also
completely determined by the local state. A step is thus uniquely determined by (1) the
name $p$ of the process that takes the step, (2) the message $m$ (if any) received by $p$ during
that step, and in the case $p$ consults an oracle in the step, (3) the value answered by the
oracle. We may therefore identify a step with a triple $[p, m, d]$, where $p \in \Pi_1$, $m \in M \cup \{null\}$
with $m = null$ if $p$ receives no message in the step, and $d \in \mathcal{V} \cup \{\bot\}$ with $d = \bot$ if $p$ consults
no oracle in the step. Given a local state $state_p$ of $p$, we say that the step $s = [p, m, d]$ is
feasible in $state_p$ in the two following cases:

1. $d$ is in $\mathcal{V}$ and $p$ has to consult an oracle in $state_p$;

2. $d = \bot$ and $p$ is not allowed to consult any oracle in $state_p$.

We denote by $s(state_p)$ the unique state of $p$ that results when $p$ performs the step $s$ in the
state $state_p$.

This description of a step leads us to generalize the definition of events given in Section 3.1
and to consider two new types of events: $(\beta, p, t, R, m)$ and $(\beta, p, t, S, m')$ - R stands for
"Receive", and S for "State change" - where $p \in \Pi_1$, $t \in \mathcal{T}$, $m \in M$, and $m' \in M \cup \{null\}$.
A step is thus a series of one, two, three, or four events of the following form:

1. $\langle (\beta, p, t, S, m') \rangle$

2. $\langle (\beta, p, t, R, m); (\beta, p, t, S, m') \rangle$

3. $\langle (\sigma, p, t, Q, v); (\sigma, p, t, A, d); (\beta, p, t, S, m') \rangle$

4. $\langle (\beta, p, t, R, m); (\sigma, p, t, Q, v); (\sigma, p, t, A, d); (\beta, p, t, S, m') \rangle$.
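
The four shapes of a step, and the feasibility condition stated earlier in this section, can be sketched as follows. The encoding is hypothetical: the buffer is tagged by the string `"beta"`, and both null (no message) and ⊥ (no oracle answer) are represented by Python's `None`.

```python
BETA = "beta"   # hypothetical tag standing for the message buffer


def step_events(p, t, received=None, consult=None, sent=None):
    """Events of one complete step of process p at time t.

    received: the message taken from the buffer, or None;
    consult:  a triple (sigma, v, d) = (sanctuary, query value, answer), or None;
    sent:     the message sent in the state-change phase, or None for null.
    """
    events = []
    if received is not None:
        events.append((BETA, p, t, "R", received))
    if consult is not None:
        sigma, v, d = consult
        events.append((sigma, p, t, "Q", v))
        events.append((sigma, p, t, "A", d))
    events.append((BETA, p, t, "S", sent))   # every step ends with a state change
    return events


def is_feasible(must_consult, d):
    """[p, m, d] is feasible iff d is a value exactly when the state consults."""
    return (d is not None) == must_consult
```

Calling `step_events` with neither, one, or both optional phases yields the sequences of kinds S, RS, QAS, and RQAS listed above.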

4.2 Histories and runs 

A history of process p is a (finite or infinite) sequence of events whose process names are p, 
and such that the times of events in this sequence form a non-decreasing list. A history H p 
of process p is well-formed if the events in H p can be grouped to form a sequence of steps, 
except possibly the last events which may only form a prefix of a step with the message 
receipt and oracle query phases (the oracle answer and state change phases may be both 
missing). The resulting sequence of complete steps in H p is denoted H p . 

For every failure pattern $F$ for $\Pi_1$ and every sanctuary $\sigma \in \Sigma$, we define the failure
pattern $F_\sigma$ for $\Gamma(\sigma)$ by

$$F_\sigma(t) = (F(t) \cap \Gamma(\sigma)) \cup (\Gamma(\sigma) \setminus \Pi_1),$$

i.e., $F_\sigma$ consists of the consultants of $\sigma$ which are either faulty with respect to $F$ or not in
the membership of $\Pi_1$.
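
As a small illustration of this definition, the induced pattern can be computed pointwise. The function name `restrict_pattern` and the representation of a failure pattern as a Python function from times to frozensets are assumptions made for this sketch.

```python
def restrict_pattern(F, consultants, pi1):
    """Return F_sigma: t -> (F(t) & Gamma(sigma)) | (Gamma(sigma) - Pi_1)."""
    gamma = frozenset(consultants)
    outsiders = gamma - frozenset(pi1)   # consultants that do not run the algorithm

    def F_sigma(t):
        return (F(t) & gamma) | outsiders

    return F_sigma
```

Consultants outside $\Pi_1$ are thus treated by the oracle as if they had crashed from the start.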

Let $F$ be a failure pattern for $\Pi_1$; a history $H_p$ of process $p$ is said to be compatible with
$F$ if any process that has crashed by some time in $F$ performs no step afterwards; in other
words, for any $(p, t) \in \Pi_1 \times \mathcal{T}$ such that $p \in F(t)$, no event of the form $(-, p, t', -, -)$ with
$t' > t$ occurs in $H_p$.

A history $H = (e_i)_{i \ge 1}$ of the algorithm $A$ is a (finite or infinite) sequence of events such
that their times $(t_i)_{i \ge 1}$ form a non-decreasing sequence in $\mathcal{T}$. The subsequence of all events
in $H$ whose process name is $p$ is denoted by $H|p$. Similarly, $H|\sigma$ denotes the subsequence
of events in $H$ related to the sanctuary $\sigma$.

We assume that initially, the message buffer $\beta$ is empty and every process $p$ is in an
initial state of $A(p)$.

From history $H$, we inductively construct the sequence $(state_\beta[i])_{i \ge 0}$ in the following
way: (a) $state_\beta[0] = \emptyset$, and (b) if $e_i = (\beta, -, t_i, R, m)$, then $state_\beta[i] = state_\beta[i - 1] \setminus \{m\}$,
and if $e_i = (\beta, -, t_i, S, m')$ with $m' \neq null$, then $state_\beta[i] = state_\beta[i - 1] \cup \{m'\}$; otherwise,
$e_i$ is an event that does not modify the buffer, i.e., $state_\beta[i] = state_\beta[i - 1]$.
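
The inductive construction of the buffer states can be sketched with a multiset. The encoding below is an assumption of ours: events are tuples `(name, p, t, kind, arg)`, the buffer is tagged `"beta"`, a message is the whole pair (destination, value), and null is represented by `None`.

```python
from collections import Counter

BETA = "beta"   # hypothetical tag standing for the message buffer


def buffer_states(H):
    """Return the list [state_beta[0], state_beta[1], ...] as multisets."""
    state = Counter()                 # (a) the buffer is initially empty
    states = [state.copy()]
    for name, _p, _t, kind, arg in H:
        if name == BETA and kind == "R":
            state[arg] -= 1           # a receive removes one copy of the message
            state = +state            # unary plus drops entries with count <= 0
        elif name == BETA and kind == "S" and arg is not None:
            state[arg] += 1           # a state change that sends m' adds it
        states.append(state.copy())   # any other event leaves the buffer unchanged
    return states
```

Using `Counter` keeps duplicates apart, which matters because the buffer is a multiset: the same message may be in transit twice.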

A run of $A$ is a triple $\rho = \langle F, I, H \rangle$ where $F$ is a failure pattern for $\Pi_1$, $I$ is a function
mapping each process $p$ to an initial state of $A(p)$, and $H$ is a history of $A$ that satisfy the
following properties R1-R6:

R1 For every sanctuary $\sigma \in \Sigma$, the subhistory $H|\sigma$ is a history of the sanctuary $\sigma$ which is
both well-formed and compatible with $F_\sigma$. Formally,

$$\forall \sigma \in \Sigma : H|\sigma \in O_\sigma(F_\sigma).$$

R2 For every process $p \in \Pi_1$, the subhistory $H|p$ is a history of the process $p$ which is both
well-formed and compatible with $F$.

R3 Every message that is received by $p$ has been previously sent to $p$. Formally,

$$\forall m \in M : (\beta, p, t_i, R, m) \in H \Rightarrow (m = (p, -) \wedge m \in state_\beta[i - 1]).$$

R4 Every step in $H$ is feasible. Formally, $state_p[0] = I(p)$ and for every process $p$, $H|p[1]$
is feasible in $state_p[0]$, $H|p[2]$ is feasible in $state_p[1] = H|p[1](state_p[0])$, etc.

To state our two last conditions, we need to introduce the notion of "process locked in
a sanctuary".

An answer event matches a query event if their process names and their oracle names
(sanctuaries) agree. A query event is pending in a history if no matching answer event follows
the query event. We say that process $p$ is locked in the sanctuary $\sigma$ during $\rho = \langle F, I, H \rangle$
if $p \in \mathrm{Correct}(F)$ and there is a pending query event of the type $(\sigma, p, -, Q, -)$ in $H$. We
denote by $\mathrm{Locked}(\rho)$ the set of processes in $\Pi_1$ which are locked in some sanctuary of $\Sigma$
during $\rho$.

R5 Every correct process that is not locked in any sanctuary takes an infinite number of
steps. Formally,

$$\forall p \in \Pi_1, \forall i : p \in \mathrm{Correct}(F) \setminus \mathrm{Locked}(\rho) \Rightarrow \exists j > i : (-, p, t_j, -, -) \in H.$$

R6 Every message sent to a correct process that is locked in no sanctuary is eventually
received. Formally,

$$\forall p \in \Pi_1, \forall i : (p \in \mathrm{Correct}(F) \setminus \mathrm{Locked}(\rho) \wedge m = (p, -) \in state_\beta[i]) \Rightarrow (\exists j > i : (\beta, p, t_j, R, m) \in H).$$

Observe that conditions R1-R6 are not independent (for instance, the compatibility requirement
in R1 is implied by R2).

4.3 Terminating algorithms 

So far, we have not made any provision for process stopping. It is easy, however, to distinguish
some of the process states as halting states, and specify that no further activity can
occur from these states. That is, no messages are sent and the only transition is a self-loop.⁶
An algorithm $A$ is said to be terminating in the presence of $f$ failures if in any run of $A$
with at most $f$ failures, every correct process eventually reaches a halting state.

It is important to notice the difference between the fact that a process may make a
decision and the fact that it may cease participating in the algorithm, that is, it may
halt [?]. Indeed, as shown by Taubenfeld, Katz, and Moran [?] for initial crashes or
by Chor and Moscovici for the randomized model, solvability results for decision tasks
highly depend on whether processes are required to terminate (after making a decision) or
not.

The definition of solvability given above does not include the termination requirement. 
However, in the case of agreement tasks, solvability does imply solvability with termination. 
To show that, it suffices to see that any algorithm solving an agreement task T can be 
translated into a terminating algorithm which solves T too. Let A be any algorithm solving 
T; we transform A in following way: 

1. as soon as a process makes a decision in A, it sends its decision value to all and then 
it halts; 

6 Note that these halting states do not play the same role as they do in classical finite-state automata 
theory. There, they generally serve as accepting states, which are used to determine which strings are in the 
language computed by the machine. Here, they just serve to model that processes halt. 



10 



2. upon the receipt of a decision notification, a process stops running A, decides on the 
value it has received. In turn, it sends its decision value to all, and then halts. 

Clearly, the resulting algorithm B solves T and every correct process eventually terminates.
The corresponding definition of the automaton B(p) from A(p) is trivial, and so omitted.
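
The two transformation rules can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the class `HaltingProcess`, its method names, and the FIFO `network` queue are hypothetical and not part of the automaton model of the paper, no process crashes, and the underlying algorithm A is abstracted as a single local decision event.

```python
from collections import deque


class HaltingProcess:
    """Wraps one process of a decision algorithm A (hypothetical interface)."""

    def __init__(self, pid, all_pids):
        self.pid = pid
        self.all_pids = list(all_pids)
        self.decision = None
        self.halted = False

    def _decide_and_halt(self, value):
        # Rules 1 and 2 share the same tail: decide, notify everyone, halt.
        self.decision = value
        self.halted = True
        return [(q, value) for q in self.all_pids if q != self.pid]

    def on_decision_in_A(self, value):
        """Rule 1: the underlying algorithm A has just decided `value`."""
        return [] if self.halted else self._decide_and_halt(value)

    def on_notification(self, value):
        """Rule 2: a decision notification arrives from another process."""
        # A halted process takes no further step and sends nothing.
        return [] if self.halted else self._decide_and_halt(value)


def simulate(all_pids, first_decider, value):
    """Failure-free toy run: one process decides in A, the others are notified."""
    procs = {p: HaltingProcess(p, all_pids) for p in all_pids}
    network = deque(procs[first_decider].on_decision_in_A(value))
    while network:
        dest, v = network.popleft()
        network.extend(procs[dest].on_notification(v))
    return procs
```

Each process sends its notification at most once, so the toy run always drains the queue; with three processes, `simulate(["p", "q", "r"], "q", 1)` leaves every process halted with decision 1.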

4.4 Algorithms for agreement tasks 

In the context of agreement problems, each process p has an initial value in V and must
reach an irrevocable decision on a value in V. Thus for an agreement problem, the algorithm
of process p, A(p), has distinct initial states $s_p^v$ indexed by $v \in V$, $s_p^v$ signifying that p's
initial value is v. The local algorithm A(p) also has disjoint sets of decision states $S_p^d$, $d \in V$.

We say that algorithm A for $\Pi_1$ using oracles of the $\Sigma$ sanctuaries solves the agreement
task $(P, f)$ for $\Pi_1$ if every run $\rho = \langle F, I, H \rangle$ of A where F is a failure pattern with at most
f failures satisfies:

Termination. Every correct process eventually decides some value. Formally,

$\forall p \in Correct(F), \exists i : state_p[i] \in \bigcup_{d \in V} S_p^d$.

Irrevocability. Once a process makes a decision, it remains decided on that value. Formally,

$\forall p \in \Pi_1, \forall d \in V, \forall i, j : (i < j \wedge state_p[i] \in S_p^d) \Rightarrow state_p[j] \in S_p^d$.

Agreement. No two processes decide differently. Formally,

$\forall p, p' \in \Pi_1, \forall i, \forall d, d' \in V : (state_p[i] \in S_p^d \wedge state_{p'}[i] \in S_{p'}^{d'}) \Rightarrow d = d'$.

P-Validity. If a process decides d, then d is allowed by P. Formally, let $\vec{V}$ denote the
vector of initial values defined by I. Then

$\forall p \in \Pi_1, \forall d \in V : (\exists i : state_p[i] \in S_p^d) \Rightarrow d \in P(F, \vec{V})$.
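
As a rough illustration, the four conditions can be checked mechanically on a finite prefix of a run. The sketch below assumes a hypothetical encoding that is not taken from the paper: each process is mapped to the list of values it has decided in its successive states (`None` while undecided), and the set of values allowed by P for the run is passed in directly. On a finite prefix, Termination can of course only be approximated.

```python
def check_agreement_run(decisions, correct, allowed):
    """Check the four conditions on a finite prefix of a run (illustrative encoding).

    decisions: dict mapping each process to a list; entry i is the value the
               process has decided in its i-th state, or None if undecided.
    correct:   set of processes that are correct in the failure pattern F.
    allowed:   set of values permitted by P for this run.
    """
    # Termination: every correct process reaches some decision state.
    termination = all(
        any(d is not None for d in decisions[p]) for p in correct
    )
    # Irrevocability: once decided on d, all later states are decided on d.
    irrevocability = all(
        seq[j] == seq[i]
        for seq in decisions.values()
        for i in range(len(seq)) if seq[i] is not None
        for j in range(i + 1, len(seq))
    )
    # Agreement, checked across all indices of the prefix at once.
    decided = {d for seq in decisions.values() for d in seq if d is not None}
    agreement = len(decided) <= 1
    # P-validity: every decided value is allowed by P.
    validity = decided <= set(allowed)
    return {
        "termination": termination,
        "irrevocability": irrevocability,
        "agreement": agreement,
        "validity": validity,
    }
```

For example, a prefix in which p and q both end up decided on 1 while a crashed process r never decides passes all four checks when `correct = {"p", "q"}`.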

5 Reduction a la Karp: K-reduction 

We need a precise definition of what it means for a task to be at least as hard as another
one. For that, we first propose the notion of reduction a la Karp, an analog for distributed
computing of the classical Karp reduction. Informally, task $T_1$ K-reduces to task $T_2$ if, to
solve $T_1$, we just have to transform the input values for $T_1$ into a set of inputs for $T_2$, and
solve $T_2$ on them. When this holds, we shall say that $T_2$ is (at least) as hard as $T_1$. We
shall prove that in synchronous systems, Consensus tasks are strictly harder than Atomic
Commitment tasks, with respect to K-reduction.

5.1 K-reduction 

Definition 5.1 Let $T_1$ and $T_2$ be two tasks for a set $\Pi$ of processes. We say that $T_1$ is
K-reducible to $T_2$, and we note $T_1 \le_K T_2$, if there is an algorithm for $T_1$ in which each
correct process p in $\Pi$ (1) transforms any input value $v_p$ into some value $w_p$ without using
any oracle, and (2) queries the oracle $O.T_2$ with $w_p$, gets an answer value d from the oracle,
and finally decides on d. The first part R of the algorithm, which transforms every vector V
for $T_1$ into the partial vector W = R(V) using no oracle, is a terminating algorithm called
a K-reduction from $T_1$ to $T_2$.

As explained in the Introduction, we expect that if some task is reducible to a second
solvable task, then we can obtain a solution for the first one. This is satisfied by K-reduction,
as shown by the following proposition.

Proposition 5.2 If $T_1$ K-reduces to $T_2$ and $T_2$ is a solvable task, then $T_1$ is solvable.

Proof: If $T_2$ is solvable, then there is an algorithm A using no oracle which solves $T_2$. Thus
we may replace the oracle $O.T_2$ by A, just after R terminates. The resulting algorithm uses
no oracle and solves $T_1$. □

Clearly K-reduction is reflexive; moreover it is transitive, namely if $T_1$, $T_2$, and $T_3$ are
three tasks for $\Pi$ such that $T_1 \le_K T_2$ and $T_2 \le_K T_3$, then $T_1 \le_K T_3$. Thus it orders tasks
with respect to their difficulty.

Let $T_1$ and $T_2$ be two agreement tasks, and let $f_1$ and $f_2$ denote their resiliency degrees,
respectively. From Definition 5.1 it follows that if $T_1 \le_K T_2$, then $O.T_2$ definitely answers
when $f_1$ processes do not query it. This implies that $f_1 \le f_2$.

Conversely, it is immediate to see that the more resilient a task is, the harder it is to
solve. Formally, if $T_1$ and $T_2$ are defined by the same agreement problem and $f_1 \le f_2$,
then $T_1 \le_K T_2$. Therefore, in the case of two agreement tasks defined by the same problem,
task $T_1$ K-reduces to $T_2$ if and only if $f_1 \le f_2$.

Notice that the key point for proving some K-reduction between two agreement tasks
lies in the validity condition. Indeed, let $T_1 = (P_1, f_1)$ and $T_2 = (P_2, f_2)$ be two agreement
tasks for some set $\Pi$ of processes. Assume that $f_1 \le f_2$ (cf. discussion above). Let R be any
algorithm running on $\Pi$ in which, starting from any input vector V for $T_1$, every correct
process p eventually outputs some value $w_p \in V$. We consider the algorithm resulting from
the query of the oracle for $T_2$ with the output values of R, and study whether or not this
algorithm solves $T_1$. Irrevocability is obvious; agreement is also trivial ($T_1$ and $T_2$ share this
condition). Because no process may be blocked in R and because $O.T_2$ is an $f_2$-resilient
oracle, termination is guaranteed in any run with at most $f_2$, and so $f_1$, failures. Hence,
showing that R is a K-reduction from $T_1$ to $T_2$ actually consists in proving that the answer
given by $O.T_2$ ensures that the $P_1$-validity condition is satisfied.

5.2 K-reducibility between Consensus and Atomic Commitment tasks 

We first establish a K-reduction result in the particular case of synchronous systems. Recall
that in such systems, one can emulate a computational model in which computations are
organized in rounds of information exchanges. On each process, a round consists of message
sending to all processes, receipt of all the messages sent to this process at this round, and
local processing (see Chapter 2 in [21] for a detailed presentation of this computational
model).

Theorem 5.3 In the synchronous model, for all integers n, f such that $0 \le f \le n - 1$,
AC(n, f) is K-reducible to Cons(n, f).

Proof: Consider the one-round algorithm R in Figure 1, which transforms every input
value $v_p$ into 1 if process p detects no failure (i.e., p receives exactly n messages) and all
the values that p receives are equal to 1; otherwise, R transforms $v_p$ into 0. We claim that
R is a K-reduction from AC(n, f) to Cons(n, f).

Code for process p

    send (v_p) to all
    receive all messages (v_q) sent to p
    if received n messages with value 1 then w_p := 1
    else w_p := 0

Figure 1: A K-reduction from AC(n, f) to Cons(n, f) in the synchronous model.

As mentioned above, we just have to address validity. If no failure occurs, then every
process receives n messages in the one-round algorithm R. Therefore if all the $v_p$'s are equal
to 1 and no process fails, then all the $w_p$'s are set to 1. Every process that is still alive
queries the oracle for Cons(n, f) with value 1, and so by the validity condition of consensus,
the oracle answers 1. On the other hand, suppose at least one process starts with 0. Each
process p receives fewer than n messages in R or receives at least one message with value 0.
In both cases, $w_p$ is set to 0. All processes query the Cons(n, f) oracle with value 0; by
the validity condition of consensus, the oracle definitely answers 0. Therefore the validity
condition of Atomic Commitment is satisfied. □
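
The reduction of Figure 1 followed by a query to the Consensus oracle can be simulated as below. This is a minimal sketch under stated assumptions: the names `reduction_round`, `consensus_oracle`, and `solve_ac` are hypothetical, a crashing process is modelled by the set of recipients its single message reached (`partial_sends`), and the oracle is replaced by a stand-in that returns one of the proposed values, selected by the `choose` argument.

```python
from itertools import product


def reduction_round(inputs, partial_sends=None):
    """One synchronous round of R; returns w_p for every process that survives.

    partial_sends maps each crashing process to the set of recipients that
    received its message; all other processes are correct.
    """
    partial_sends = partial_sends or {}
    n = len(inputs)
    w = {}
    for p in inputs:
        if p in partial_sends:
            continue  # p crashes during the round and never queries the oracle
        received = [inputs[q] for q in inputs
                    if q not in partial_sends or p in partial_sends[q]]
        w[p] = 1 if len(received) == n and all(v == 1 for v in received) else 0
    return w


def consensus_oracle(proposals, choose=min):
    """Stand-in for the Cons oracle: the answer is one of the proposed values."""
    return choose(proposals.values())


def solve_ac(inputs, partial_sends=None, choose=min):
    """Apply R, then decide on the oracle's answer."""
    return consensus_oracle(reduction_round(inputs, partial_sends), choose)


# Exhaustive check of Atomic Commitment validity for n = 3 (illustrative only).
for bits in product([0, 1], repeat=3):
    votes = dict(zip("abc", bits))
    for choose in (min, max):
        # No failure: decide 1 exactly when every vote is 1.
        assert solve_ac(votes, choose=choose) == (1 if all(bits) else 0)
        # Process "a" crashes after reaching only "b": 1 requires all votes 1.
        assert solve_ac(votes, {"a": {"b"}}, choose) == 0 or all(bits)
```

Whatever value the stand-in oracle picks among the proposals, a 0 vote forces every surviving $w_p$ to 0, which is the point of the validity argument above.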

Conversely, we prove that if $f \ge 1$ then Cons(n, f) is not K-reducible to AC(n, f), and
so Cons(n, f) is strictly harder to solve than AC(n, f) in synchronous systems.

Theorem 5.4 For any integers n, f such that $1 \le f \le n - 1$, Cons(n, f) is never
K-reducible to AC(n, f), even in synchronous systems.

Proof: For the sake of contradiction, suppose that there exists a K-reduction R from
Cons(n, f) to AC(n, f) in the synchronous model, and consider the resulting algorithm for
Cons(n, f).

We consider a failure-free run $\rho$ of this algorithm which starts with the input vector
1; let p be the name of the process which terminates R last in this run, and let $r_p$ denote
the round number when p completes the computation of $w_p$ in this run. Let F denote the
failure pattern such that all processes are correct, except p which crashes just at the end of
round $r_p$. Now there is a run $\rho'$ of the algorithm for Cons(n, f) whose failure pattern is F,
which starts with the input vector 1, and such that every process has the same behavior by
the end of round $r_p$ in $\rho'$ as in $\rho$. In $\rho'$, process p does not query the AC(n, f) oracle, which
therefore definitely answers 0 (cf. property $O_{AC}$ in Section 3.3). Consequently, the validity
condition of consensus is violated in $\rho'$, a contradiction when $f \ge 1$.

This proves that Cons(n, f) is not K-reducible to AC(n, f) in synchronous systems, and
so in asynchronous systems. □

6 Reductions a la Cook: C-reduction and C*-reduction 

In Section 5 we introduced the notion of K-reducibility as a specific way of using a solution
to one task to solve other tasks: if $T_1$ is K-reducible to $T_2$, and we have a solution for $T_2$,
we obtain a solution for $T_1$ just by transforming the input values for $T_1$ into input values
for $T_2$. Such a reducibility notion is very restrictive: just one solution for $T_2$ can be used
to design a solution for $T_1$, and only at the end.

We now propose some weaker notion of reduction in which every process is allowed to
query the oracle several times, and not just at the end as with K-reduction. Actually, we
define two such notions of reduction: a first one which applies to arbitrary tasks for some
given sets of processes, and a second one which makes sense for symmetric tasks. These
are analogs for distributed computing of the classical Cook reduction, and will be called
C-reduction and C*-reduction.

6.1 C-reduction 

Consider the following data: 

• a finite set of process names $\Pi$;

• a family $\{\Pi_2^\sigma : \sigma \in \Sigma\}$ of subsets of $\Pi$, indexed by a finite set $\Sigma$ (the sanctuaries),
and, for any $\sigma \in \Sigma$, a task $T_2^\sigma$ for $\Pi_2^\sigma$;

• a finite subset $\Pi_1$ of $\Pi$, and a task $T_1$ for $\Pi_1$.

Definition 6.1 We say that $T_1$ is C-reducible to $\{T_2^\sigma : \sigma \in \Sigma\}$, and we note

$T_1 \le_C \{T_2^\sigma : \sigma \in \Sigma\}$

if there is an algorithm R for $T_1$ using the oracles $\{O.T_2^\sigma : \sigma \in \Sigma\}$. The algorithm R is
called a C-reduction from $T_1$ to $\{T_2^\sigma : \sigma \in \Sigma\}$.

Often, we deal with a set of sanctuaries reduced to a singleton: the family $\{T_2^\sigma : \sigma \in \Sigma\}$
is then given by one task $T_2$, and we simply say that "$T_1$ is C-reducible to $T_2$", and write
$T_1 \le_C T_2$.

This notion of C-reducibility is transitive in the following strong sense: Consider a set
of sanctuaries $\mathrm{T}$, and for any $\tau \in \mathrm{T}$, a set of "affiliated sanctuaries" $S(\tau)$. Let

$S_{\mathrm{T}} = \bigcup_{\tau \in \mathrm{T}} \{\tau\} \times S(\tau)$.

Assume that a task $T_1$ is C-reducible to a family of tasks $\{T_2^\tau : \tau \in \mathrm{T}\}$, and that, for
any $\tau \in \mathrm{T}$, the task $T_2^\tau$ is C-reducible to a family of tasks $\{T_3^{\sigma,\tau} : \sigma \in S(\tau)\}$. Then $T_1$ is
C-reducible to the family of tasks $\{T_3^{\sigma,\tau} : (\tau, \sigma) \in S_{\mathrm{T}}\}$.

In particular, restricted to single tasks, C-reducibility is transitive in the usual sense: if
$T_1 \le_C T_2$ and $T_2 \le_C T_3$, then $T_1 \le_C T_3$. It is also clearly reflexive.

C-reduction satisfies our intuitive concept of reducibility as shown by the following 
proposition. 

Proposition 6.2 If $T_1$ C-reduces to $\{T_2^\sigma : \sigma \in \Sigma\}$ and every task $T_2^\sigma$ is a solvable task,
then $T_1$ is solvable.

Proof: Let R be a C-reduction from $T_1$ to $\{T_2^\sigma : \sigma \in \Sigma\}$. Since $T_2^\sigma$ is solvable, there
exists an algorithm $B^\sigma$ using no oracle which solves $T_2^\sigma$. As explained in Section 4.3, we
may suppose that $B^\sigma$ is a terminating algorithm.

Let p be a process in $\Pi_1$, and let $state_p$ be any state of R(p) in which p consults some
oracle $O.T_2^\sigma$ with the query value v. Consider a transition of R(p) from $state_p$ corresponding
to step [p, m, d]. Since $B^\sigma$ terminates, there is no problem in replacing this transition in R(p)
by the (possibly empty) sub-automaton of $B^\sigma(p)$ consisting of the states which are reachable
from the initial state $s_p^v$ and lead to some halting state in $S_p^d$. Thus we may replace the
algorithm for $T_1$ using the set of oracles $\{O.T_2^\sigma : \sigma \in \Sigma\}$ by an ordinary algorithm (using
no oracle) that solves $T_1$. □

Combined with the transitivity of C-reduction, this latter proposition shows that like 
K-reducibility, C-reducibility orders tasks with respect to their difficulty. 

From Definition 6.1 it is straightforward that K-reducibility is at least as strong as
C-reducibility: if $T_1$ and $T_2$ are two tasks for $\Pi$ such that $T_1 \le_K T_2$, then $T_1 \le_C T_2$. However,
some major differences between K- and C-reducibility should be emphasized. Firstly, the
flexible use of oracles in the definition of C-reduction allows us to compare tasks for different
sets of processes, whereas any two tasks comparable with respect to the K-reduction are
necessarily tasks for the same set of processes. Secondly, note that if $T_1$ is any solvable
task, then it is C-reducible to any task $T_2$; this would not hold for K-reducibility (cf.
Theorem 5.4).

This latter remark shows that for any two solvable tasks $T_1$ and $T_2$, we have both
$T_1 \le_C T_2$ and $T_2 \le_C T_1$. In other words, two solvable tasks are equivalent with respect to
C-reducibility. Actually, C-reduction discriminates unsolvable tasks and is aimed at determining
unsolvability degrees. Since we focus on agreement problems which are all solvable in the
absence of failure, from now on we shall assume that the resiliency degree of tasks is at
least 1.

6.2 C*-reduction 

Let $\Pi_1$ and $\Pi_2$ be two sets of $n_1$ and $n_2$ processes, respectively. Let $T_1$ be a task for $\Pi_1$, and
$T_2$ be a symmetric task on $\Pi_2$. Under these assumptions, we can define a weaker notion of
reduction involving a "symmetrization of $T_2$ inside $\Pi_1$".

Formally, for any subset $\Pi$ of $\Pi_1$ of cardinality $n_2$ and any one-to-one mapping $\Phi$ from
$\Pi_2$ onto $\Pi$, consider the task ${}^{\Phi}T_2$ for $\Pi$. Since $T_2$ is symmetric, ${}^{\Phi}T_2$ is invariant under
permutation of $\Pi$, and so only depends on $\Pi$ (and not on the specific choice of the mapping
$\Phi : \Pi_2 \to \Pi$). This allows us to denote this task ${}^{\Pi}T_2$ instead of ${}^{\Phi}T_2$.

Definition 6.3 We say that $T_1$ C*-reduces to $T_2$, and we note $T_1 \le_{C^*} T_2$, if we have

$T_1 \le_C \{ {}^{\Pi}T_2 : \Pi \subseteq \Pi_1 \text{ and } |\Pi| = n_2 \}$.

Actually, in the sequel we shall deal with this notion only when $T_1$ is also symmetric.
Observe that C*-reduction is an interesting notion only in the case $n_1 > n_2$: when $n_1 = n_2$
(resp. $n_1 < n_2$), $T_1$ C*-reduces to $T_2$ iff after any renaming $\Pi_1 \to \Pi_2$, the task $T_1$ C-reduces
to $T_2$ (resp., iff $T_1$ is solvable).

Notice that the possibility of introducing a second notion of reducibility a la Cook,
namely the C*-reduction, besides C-reduction, relies basically on the existence of several
"partial renamings" $\Pi_2 \to \Pi\ (\hookrightarrow \Pi_1)$; it is thus inherent to the distributed nature of the
computations we deal with, and has no counterpart in classical complexity theory.

From the strong transitivity property of the C-reduction, we derive that $\le_{C^*}$ is a
transitive relation. It is clearly reflexive.

Finally, assume that $\Pi_2 \subseteq \Pi_1$. Then it is straightforward that for $T_1$ and $T_2$ as above,
if $T_1 \le_C T_2$ then $T_1 \le_{C^*} T_2$. In Section 7.1 we shall show that except in the case $\Pi_1 = \Pi_2$,
the converse does not generally hold. Interestingly, in Part II, we shall exhibit classes of
distributed tasks for which these two reductions turn out to coincide.

Code for process p:

    initialization:
        d_p ∈ V ∪ {⊥}, initially ⊥

    for i = 1 to m do:
        if p ∈ Π_i then
            Query(O.AC(Π_i, f))(v_p)
            Answer(O.AC(Π_i, f))(w_i)
            Send((i, w_i)) to all
    wait until [Receive((i, w_i)) for all i ∈ {1, ..., m}]
    d_p := max_{i=1,...,m}(w_i)
    Decide(d_p)

Figure 2: A C*-reduction from Cons(n + f, f) to AC(n, f).

6.3 A first example: Cons(n + f, f) is C*-reducible to AC(n, f)

We now see a first example of C*-reduction, showing how to extract Consensus from Atomic 
Commitment. 

Let $\Pi$ be a set of n + f process names. We consider the $m = \binom{n+f}{n}$ subsets of $\Pi$
of cardinality n. Let us fix an arbitrary order on these subsets $\Pi_1, \ldots, \Pi_m$, and a set of
sanctuaries $\{1, \ldots, m\}$. In Figure 2 we give the code of a simple Consensus algorithm for
$\Pi$ using the oracles $O.AC(\Pi_1, f), \ldots, O.AC(\Pi_m, f)$. Informally, every process p consults
these oracles with its initial value $v_p$, according to the order $1, \ldots, m$, and skipping the
indexes i for which $p \notin \Pi_i$. As soon as p gets a response from an oracle, it broadcasts it in
$\Pi$. Eventually, it knows all the values answered by the oracles (including those that it has
not consulted), and then decides on the greatest value.

Theorem 6.4 Let n, f be any positive integers, $1 \le f \le n - 1$, and let $\Pi$ be a set of n + f
processes. The algorithm in Figure 2 solves the task Cons($\Pi$, f), and so Cons(n + f, f)
C*-reduces to AC(n, f).

Proof: We first prove the termination property. By a simple induction on i, we easily show
that every oracle $O.AC(\Pi_i, f)$ is consulted by at least $|\Pi_i| - f = n - f$ processes, and so
no process is blocked in the sanctuary i. Every correct process $p \in \Pi_i$ thus gets an answer
from the oracle $O.AC(\Pi_i, f)$, and then broadcasts it in $\Pi$. Since $n \ge f + 1$, the subset
$\Pi_i$ contains at least one correct process. Therefore every correct process eventually knows
the m values answered by the oracles $O.AC(\Pi_1, f), \ldots, O.AC(\Pi_m, f)$, and then makes a
decision.

Irrevocability is obvious. Agreement follows from the decision rule and the fact that
every process which makes a decision knows the values answered by all the oracles.

For validity, if all the initial values are 0, then every oracle is queried with value 0 by
at least one process, and so answers value 0. Therefore, the decision value is 0.

Suppose now that all the processes in $\Pi$ start with initial value 1. Since at most f
processes are faulty, there is at least one subset $\Pi_i$ in which all processes are correct. Then
the answer given by the oracle $O.AC(\Pi_i, f)$ is 1. By the decision rule, it follows that the
decision value is 1. □
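
The decision rule of Figure 2 can be exercised with a small simulation. The sketch below rests on assumptions that are not in the paper: faulty processes are taken to crash initially (they never query any oracle), message passing is abstracted away, and `ac_oracle` is a hypothetical deterministic stand-in that answers 1 only when every member of the sanctuary has queried with 1.

```python
from itertools import combinations


def ac_oracle(queries, members):
    """Hypothetical stand-in for O.AC(members, f): 1 only if all members vote 1."""
    return 1 if all(queries.get(p) == 1 for p in members) else 0


def consensus_from_ac(inputs, f, crashed=frozenset()):
    """Decision value of the Figure 2 algorithm on n + f processes."""
    procs = sorted(inputs)
    n = len(procs) - f
    assert len(crashed) <= f, "at most f failures are tolerated"
    answers = []
    for members in combinations(procs, n):  # the m subsets of cardinality n
        # Initially crashed processes never query the oracle.
        queries = {p: inputs[p] for p in members if p not in crashed}
        assert len(queries) >= n - f  # enough queries for the oracle to answer
        answers.append(ac_oracle(queries, members))
    return max(answers)  # decide on the greatest answered value
```

With n = 3 and f = 1, four processes that all start with 1 decide 1 even if one of them crashes, because the subset made of the three correct processes yields the answer 1.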



7 C^-reductions when varying the number of processes 

We now investigate various C- and C*-reductions between tasks associated to the same 
agreement problem, but which differ in the cardinalities of the sets of processes for which 
they are defined (and in their resiliency degrees as well). The reductions we describe are 
simple; their correctness proofs are straightforward and will be omitted. 

The first reduction we shall give solves AC(n + 1, f) using two oracles of AC(n, f) type.
Interestingly, a single such oracle is not sufficient to solve AC(n + 1, f). In other words,
we prove that although AC(n + 1, f) is C*-reducible to AC(n, f), it is not C-reducible to
AC(n, f).

We then establish that Cons(n, f) falls between Cons(n + 1, f + 1) and Cons(n + 1, f)
with regard to the ordering $\le_C$.

7.1 AC(n + 1, f) C*-reduces but does not C-reduce to AC(n, f)

Let $\Pi$ be any set of n + 1 processes; each process $q \in \Pi$ has an initial value $x_q \in \{0, 1\}$.
Let us consider any two different subsets $\Pi'$ and $\Pi''$ of $\Pi$ with n processes. (Without
loss of generality, we could have assumed that $\Pi = \{1, \ldots, n + 1\}$, $\Pi' = \{1, \ldots, n\}$, and
$\Pi'' = \{2, \ldots, n + 1\}$.)

We first sketch a simple algorithm running on $\Pi$ which uses both $O.AC(\Pi', f)$ and
$O.AC(\Pi'', f)$: Every process q first queries $O.AC(\Pi', f)$ if $q \in \Pi'$, and then queries
$O.AC(\Pi'', f)$ if $q \in \Pi''$. Each of these two oracles is consulted by at least n - f processes,
and so eventually answers. Let $d'$ and $d''$ be the responses of $O.AC(\Pi', f)$ and $O.AC(\Pi'', f)$,
respectively. Every process in $\Pi'$ that is still alive sends $d'$ to all processes in $\Pi$; similarly,
every process in $\Pi''$ broadcasts $d''$. As f < n, $\Pi'$ and $\Pi''$ both contain at least one correct
process, and so every process in $\Pi$ eventually knows both $d'$ and $d''$. Finally, every alive
process decides on $d = \min(d', d'')$. This establishes:

Proposition 7.1 If n and f are two integers such that $1 \le f \le n - 1$, then AC(n + 1, f)
is C*-reducible to AC(n, f).
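
The algorithm sketched above can be illustrated as follows. This is a minimal sketch under assumptions of our own: faulty processes crash initially, the broadcast of $d'$ and $d''$ is abstracted away, and `ac_oracle` is a hypothetical deterministic stand-in for an Atomic Commitment oracle that answers 1 only when every member of its sanctuary has queried with 1.

```python
def ac_oracle(queries, members):
    """Hypothetical stand-in for O.AC(members, f): 1 only if all members vote 1."""
    return 1 if all(queries.get(q) == 1 for q in members) else 0


def ac_from_two_oracles(inputs, crashed=frozenset()):
    """Decision value for AC on n + 1 processes using two oracles on n processes."""
    procs = sorted(inputs)
    pi_prime, pi_second = procs[:-1], procs[1:]  # two different subsets of size n

    def ask(members):
        # Initially crashed processes never query the oracle.
        queries = {q: inputs[q] for q in members if q not in crashed}
        return ac_oracle(queries, members)

    d_prime, d_second = ask(pi_prime), ask(pi_second)
    return min(d_prime, d_second)  # commit only if both oracles answered 1
```

Since the two subsets cover all n + 1 processes when n is at least 2, a single 0 vote anywhere forces at least one oracle, and hence the minimum, to 0.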

We shall now prove that AC(n + 1, f) is not solvable just using the oracle O.AC(n, f).
This result will demonstrate that C-reducibility is actually a stronger notion than
C*-reducibility.

Proposition 7.2 If n and f are two integers such that $1 \le f \le n - 1$, then AC(n + 1, f)
is not C-reducible to AC(n, f).

Proof: As before, let $\Pi$ be a set of n + 1 processes and $\Pi'$ a subset of $\Pi$ with n processes.

For the sake of contradiction, suppose that there is an algorithm R using the oracle
$O.AC(\Pi', f)$ which solves AC($\Pi$, f). Let p be the unique process in $\Pi \setminus \Pi'$. Consider a run
$\rho = \langle F, I, H \rangle$ of R such that, for any $q \in \Pi$, $I(q) = s_q^1$, and for any $t \in T$, $F(t) = \{p\}$. In
other words, $\rho$ is a run of R in which all processes start with initial value 1 and no process
is faulty except p, which initially crashes. Let d denote the decision value in $\rho$.

We now prove that d = 0. For that, we introduce the mapping $I'$ on $\Pi$ which is identical
to I over $\Pi \setminus \{p\}$ and satisfies $I'(p) = s_p^0$, and we consider $\rho' = \langle F, I', H \rangle$. We claim that $\rho'$
is a run of R. Since $\rho$ is a run, it is straightforward that $\rho'$ satisfies R1, R2, R3, R5, and R6.
By an easy induction, we see that for any process q, $q \ne p$, the sequence of the local states
reached by q is the same in $\rho'$ as in $\rho$. This ensures that every step in H is feasible in $\rho'$,
and so R4 holds in $\rho'$. Thus, $\rho'$ is a run of R, and by the validity condition of Atomic
Commitment, the only possible decision value in $\rho'$ is 0. This shows that d = 0.

Next we construct a failure-free run of R for which the history begins as H, up to the
moment all processes in $\Pi'$ have made a decision. To achieve that, we need the following
lemma, where $F_0$ denotes the failure pattern with no failure (defined formally by $F_0(t) = \emptyset$,
for any $t \in T$), and $H[0, t]$ denotes the prefix in H of events with time less than or equal to t.

Lemma 7.3 For any $t_0 \in T$, there exists an extension $H_0$ of $H[0, t_0]$ such that $\langle F_0, I, H_0 \rangle$
is a failure-free run of R.

Proof: The history $H_0$ is constructed in stages, starting from $H[0, t_0]$. Each stage consists
in adding zero or one event. A queue of the processes in $\Pi$ is maintained, initially in an
arbitrary order, and the messages in $\beta$ are ordered according to the time the messages were
sent, earliest first.

Suppose that the finite history $H_0[0, t]$ extending $H[0, t_0]$ is constructed. Let $t^+$ denote
the successor of t in T, and let q be the first process in the process queue. After $H_0[0, t]$, q
may achieve only one type T of event. There are three cases to consider:

1. T = S or T = Q. The automaton R(q) entirely determines the event $e = (\beta, q, t^+, S, m)$
or $e = (AC(\Pi', f), q, t^+, Q, v)$ which q may achieve at time $t^+$.

2. T = R. In this case, the message buffer $\beta$ contains at least one message for q. Then
we let $e = (\beta, q, t^+, R, m)$, where m denotes the earliest message for q in $\beta$.

3. T = A. Form the successive consultations of $O.AC(\Pi', f)$ in $H_0[0, t]$, and focus on the
latest consultation. There are three subcases:

Case 1: $O.AC(\Pi', f)$ has already answered some value d.
In this case, we let $e = (AC(\Pi', f), q, t^+, A, d)$.

Case 2: $O.AC(\Pi', f)$ has not yet answered, but has been queried by all processes in $\Pi'$.
We let $e = (AC(\Pi', f), q, t^+, A, d)$, where d denotes the minimum of all the query values.

Case 3: $O.AC(\Pi', f)$ has not yet answered and has not yet been queried by some
processes in $\Pi'$.
In this case, we skip q's turn and no event is determined in this stage.

If the above procedure determines an event e, then we let $H_0[0, t^+] = H_0[0, t]; e$ (where
semicolon denotes concatenation). Otherwise we are in Case 3.3, and we let
$H_0[0, t^+] = H_0[0, t]$. Process q is then moved to the back of the process queue.
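
The three subcases governing the oracle's answer event can be summarized by a small function. This is only a sketch: the function name and the dictionary encoding of the current consultation are hypothetical, and the rest of the staged construction (process queue, message buffer) is not modelled.

```python
def oracle_answer_event(previous_answer, queries, members):
    """Value answered to q at this stage, or None when q's turn is skipped.

    previous_answer: value already answered in the current consultation, or None.
    queries:         dict mapping each process that has queried so far to its value.
    members:         processes of the sanctuary of the oracle.
    """
    if previous_answer is not None:
        return previous_answer          # Case 1: repeat the value already answered
    if all(p in queries for p in members):
        return min(queries.values())    # Case 2: minimum of all the query values
    return None                         # Case 3: skip q's turn, no event added
```

Case 2 is where the failure-free extension commits the oracle to an answer: once every member has queried, the answer is the minimum of the queries.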

This inductively defines $H_0$. By construction, $\rho_0 = \langle F_0, I, H_0 \rangle$ satisfies R1-R6, and so is
a failure-free run of R. □ Lemma 7.3

We now instantiate $t_0$ to be the time when the last process makes a decision in $\rho$. The
lemma provides an extension $H_0$ of $H[0, t_0]$ such that $\rho_0 = \langle F_0, I, H_0 \rangle$ is a run of R. The
decision value in $\rho_0$ is 0, which contradicts the fact that processes must decide on 1 in a
failure-free run of an Atomic Commitment algorithm in which all processes start with initial
value 1. □

7.2 Cons(n + 1, f) is C-reducible to Cons(n, f)

Contrary to what happens with Atomic Commitment, when dealing with Consensus, a 
decision value for a restricted subset of processes may always be adopted by all processes 
to make a global decision. In other words, a process kernel may impose a common decision 
on the whole system without violating validity for a general consensus. 

Let $\Pi$ be any set of n + 1 processes; each process $p \in \Pi$ starts with an initial value
$x_p \in \{0, 1\}$. We fix any subset $\Pi'$ of $\Pi$ with n processes, and we consider the oracle
$O.Cons(\Pi', f)$ which may be consulted by any member of the process kernel $\Pi'$. A
C-reduction from Cons($\Pi$, f) to Cons($\Pi'$, f) is as follows: Every process p in $\Pi'$ first queries
$O.Cons(\Pi', f)$ with its initial value $x_p$. All the correct processes in $\Pi'$ eventually get a
common answer d since at most f processes may be prevented from querying the oracle
$O.Cons(\Pi', f)$. Then every process in $\Pi'$ that is still alive sends d to all processes in $\Pi$. As
$f \le n - 1$, $\Pi'$ contains at least one correct process, and so every process in $\Pi$ eventually
receives d. Finally, every alive process decides on d. By property $O_{Cons}$ (Section 3.3),
validity of Consensus is satisfied. This establishes:

Proposition 7.4 If n and f are two integers such that $1 \le f \le n - 1$, then Cons(n + 1, f)
is C-reducible to Cons(n, f).
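
The kernel-based reduction can be illustrated by the following sketch. The function name and the `choose` parameter are hypothetical: `choose` stands in for the Consensus oracle on the kernel and returns one of the proposed values, faulty processes are assumed to crash initially, and the broadcast of d is abstracted away.

```python
def consensus_via_kernel(inputs, kernel, crashed=frozenset(), choose=min):
    """Decisions of the alive processes when a kernel imposes its consensus value.

    inputs:  dict mapping each of the n + 1 processes to its initial value.
    kernel:  the n processes allowed to consult the oracle.
    crashed: processes that crash initially (they never query the oracle).
    """
    proposals = {p: inputs[p] for p in kernel if p not in crashed}
    if not proposals:
        raise ValueError("the kernel must contain at least one correct process")
    d = choose(proposals.values())  # stand-in for the oracle's common answer
    # Every alive process, inside or outside the kernel, adopts d.
    return {p: d for p in inputs if p not in crashed}
```

The answer d is the initial value of some kernel member, hence of some process of the whole system, which is why validity of Consensus is preserved for all n + 1 processes.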

7.3 Cons(n, f) is C-reducible to Cons(n + 1, f + 1)

At this point, one may wonder whether conversely, Cons(n, f) is C-reducible to
Cons(n + 1, f). A negative answer to this question will be given in Part II, thanks to the
introduction of the new class of k-Threshold Agreement tasks [7].

Instead of comparing Cons(n, f) with Cons(n + 1, f), we may consider the a priori harder
task Cons(n + 1, f + 1), and show that Cons(n, f) is indeed C-reducible to Cons(n + 1, f + 1).
Let $\Pi$ be any set of n + 1 processes, and let $\Pi'$ be any subset of $\Pi$ with n processes. The
C-reduction from Cons($\Pi'$, f) to Cons($\Pi$, f + 1) is trivial: Each process in $\Pi'$ just needs to
query the oracle $O.Cons(\Pi, f + 1)$ with its initial value. The oracle definitely answers since
it is consulted by at least n - f = (n + 1) - (f + 1) processes. Every process finally decides
on the value provided by $O.Cons(\Pi, f + 1)$. By property $O_{Cons}$, validity of Consensus is
satisfied. This establishes:

Proposition 7.5 If n and f are two integers such that $1 \le f \le n - 1$, then Cons(n, f) is
C-reducible to Cons(n + 1, f + 1).

Proposition 7.5 states that Cons(n + 1, f + 1) is at least as hard as Cons(n, f), which,
by Proposition 7.4, is at least as hard as Cons(n + 1, f). In other words, Cons(n, f) is
sandwiched between Cons(n + 1, f) and Cons(n + 1, f + 1) with respect to the ordering $\le_C$.

By Proposition 7.5 it follows that if Cons(n, n - 1) is C-reducible to some task T,
then every task Cons(m, m - 1), with m not greater than n, also C-reduces to T. Thus
to any distributed task T, it is natural to associate the largest positive integer n such that
Cons(n, n - 1) $\le_C$ T. Pursuing the analogy between oracles and shared objects that we have
outlined in Section 3.4, this number actually corresponds to the consensus number defined
by Herlihy in [17].

8 C-reduction between Atomic Commitment and Consensus



This section is devoted to the C-reducibility between Consensus and Atomic Commitment
tasks for some given set of processes. Our main results are impossibility results: in
Sections 8.1 and 8.2 we show that, for values of resiliency degree greater than one, Consensus
and Atomic Commitment tasks are indeed not C-comparable. 7 In Section 8.3 we also
prove a C-reducibility result from Cons(n, 1) to AC(n, 1), concerning the remaining case of
resiliency degree one.

8.1 Atomic Commitment cannot be reduced to Consensus 

To make our result as strong as possible, we are going to prove it with a resiliency degree of 
the Atomic Commitment task as small as possible, and a resiliency degree of the Consensus 
task as great as possible. Actually, we prove the following theorem: 

Theorem 8.1 For any integer n, $n \ge 2$, AC(n, 1) is not C-reducible to Cons(n, n - 1), and
thus, for any integer f such that $1 \le f \le n - 1$, AC(n, f) is not C-reducible to Cons(n, f).

Proof: Let $\Pi$ be a set of n process names. Suppose, for the sake of contradiction, that
there is an algorithm R for the task AC($\Pi$, 1) which uses the oracle $O.Cons(\Pi, n - 1)$. Let p
be any process in $\Pi$. Consider a run $\rho = \langle F, I, H \rangle$ of R such that, for any $q \in \Pi$, $I(q) = s_q^1$,
and for any $t \in T$, $F(t) = \{p\}$. In other words, $\rho$ is a run of R in which all processes start
with initial value 1 and no process is faulty except p, which initially crashes. Let d denote
the decision value in $\rho$.

We are going to prove that d = 0. For that, we introduce the mapping $I'$ which is
identical to I over $\Pi \setminus \{p\}$ and satisfies $I'(p) = s_p^0$, and we consider $\rho' = \langle F, I', H \rangle$. We
claim that $\rho'$ is a run of R. Since $\rho$ is a run, it is straightforward that $\rho'$ satisfies R1, R2,
R3, R5, and R6. By an easy induction, we see that for any process q, $q \ne p$, the sequence of
the local states reached by q is the same in $\rho'$ as in $\rho$. This ensures that every step in H is
feasible in $\rho'$, and so R4 holds in $\rho'$. Thus, $\rho'$ is a run of R, and by the validity condition
of Atomic Commitment, the only possible decision value in $\rho'$ is 0. This shows that d = 0.

Now from $\rho$, we are going to construct a failure-free run of R by using the asynchronous
structure of computations. To achieve that, we need the following lemma, where $F_0$ denotes
the failure pattern with no failure (defined formally by $F_0(t) = \emptyset$, for any $t \in T$), and $H[0, t]$
denotes the prefix in H of events with time less than or equal to t.

Lemma 8.2 For any $t_0 \in T$, there exists an extension $H_0$ of $H[0, t_0]$ such that $\langle F_0, I, H_0 \rangle$
is a failure-free run of R.

Proof: The proof technique is similar to the one of Lemma 7.3, except for Case 3 (T = A).
In this case, we also form the successive consultations of $O.Cons(\Pi, n - 1)$ in $H_0[0, t]$, and
focus on the latest consultation. Note that process q has necessarily queried $O.Cons(\Pi, n - 1)$
during this consultation; let v be the value of this query. There are two subcases:

Case 1: $O.Cons(\Pi, n - 1)$ has already answered some value d.
In this case, we let $e = (Cons(\Pi, n - 1), q, t^+, A, d)$.

Case 2: $O.Cons(\Pi, n - 1)$ has not yet answered.
We let $e = (Cons(\Pi, n - 1), q, t^+, A, v)$.

We complete the proof of this lemma as the one of Lemma 7.3. □ Lemma 8.2

7 In an unpublished joint work with S. Toueg [S], a weaker version of these results involving only an
informal notion of reduction was already obtained. It stated that AC(n, f) is not reducible to Cons(n, f)
when $1 \le f \le n - 1$, and that Cons(n, f) is not reducible to AC(n, f) when $2 \le f \le n - 1$. Due to the lack
of a formal model for oracles, the proofs had to be of a different nature, and indeed were based on arguments
a la Fischer-Lynch-Paterson [16].

We now instantiate $t_0$ to be the time when the last process makes a decision in $\rho$. The
lemma provides an extension $H_0$ of $H[0, t_0]$ such that $\rho_0 = \langle F_0, I, H_0 \rangle$ is a run of R. The
decision value in $\rho_0$ is 0, which contradicts the fact that processes must decide on 1 in a
failure-free run of an Atomic Commitment algorithm in which all processes start with initial
value 1. □



8.2 Consensus cannot be generally reduced to Atomic Commitment 

Conversely, we now prove that Consensus is generally not C-reducible to Atomic
Commitment. The proof technique is new and quite different from that of Theorem 8.1:
basically, it consists of a "meta-reduction" to the impossibility result of Consensus with one
failure [16].

As for Theorem 8.1, to make our result as strong as possible, we state it with a resiliency
degree of the Consensus task as small as possible and a resiliency degree of the Atomic
Commitment task as great as possible.

Theorem 8.3 For any integer n, n ≥ 3, Cons(n, 2) is not C-reducible to AC(n, n − 1), and
thus for any integer f such that 2 ≤ f ≤ n − 1, Cons(n, f) is not C-reducible to AC(n, f).

Proof: We also proceed by contradiction: let Π be a set of n process names, and suppose
that there is an algorithm R for Cons(n, 2) using the oracle O.AC(Π, n − 1). Let a denote
the sanctuary of this oracle. We fix some process p ∈ Π. From R, we shall design an
algorithm A running on the system Π \ {p}, which uses no oracle. We shall then prove that
A solves the task Cons(Π \ {p}, 1), which contradicts the impossibility of Consensus with
one failure established by Fischer, Lynch, and Paterson [16].

For each process q, we define the automaton A(q) in the following way:

• the set of states of A(q) is the same as that of R(q);

• the set of initial states of A(q) is the same as that of R(q);

• each transition (s_q, [q, m, ⊥], s'_q) of R(q) in which q consults no oracle is also a
transition of A(q);

• each transition (s_q, [q, m, 1], s'_q) of R(q) in which the oracle answers 1 is removed;

• each transition (s_q, [q, m, 0], s'_q) of R(q) in which the oracle answers 0 is replaced by
the transition (s_q, [q, m, ⊥], s'_q).

Note that all the steps in A(q) are of the form [q, m, ⊥]; in other words, the algorithm A
uses no oracle.
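
As an illustration only, the construction of A(q) from R(q) can be sketched in Python as a filter over a list of transitions. The Transition record, its field names, the function strip_oracle, and the use of None for ⊥ are assumptions of this sketch; they are not part of the formal model.

```python
from typing import NamedTuple, Optional

BOTTOM = None  # stands for ⊥: the step consults no oracle


class Transition(NamedTuple):
    """Hypothetical encoding of a transition (s_q, [q, m, d], s'_q)."""
    source: str
    process: str
    message: Optional[str]   # received message, or None for "null"
    answer: Optional[int]    # BOTTOM, 0 or 1
    target: str


def strip_oracle(transitions_of_R):
    """Build the transitions of A(q) from those of R(q)."""
    transitions_of_A = []
    for t in transitions_of_R:
        if t.answer is BOTTOM:
            # No oracle consulted: the transition is kept unchanged.
            transitions_of_A.append(t)
        elif t.answer == 0:
            # Oracle answers 0: same step, with the oracle hidden.
            transitions_of_A.append(t._replace(answer=BOTTOM))
        # Oracle answers 1: the transition is removed.
    return transitions_of_A
```

Every transition returned by strip_oracle carries ⊥ in its third step component, which is the property used in the remainder of the proof.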

Let ρ_A = <F, I, H> be any run of A. Each event in H is of the form e = (β, q, −, −, −),
and is part of some transition (s_q, [q, m, ⊥], s'_q) of A(q), where m ∈ M ∪ {null}. In the
construction of A(q) described above, this transition results from some unique transition of
R(q), of the form (s_q, [q, m, ⊥], s'_q) or (s_q, [q, m, 0], s'_q). In this way, to each event in H, we
associate a unique transition of R(q) in which the oracle is not consulted or answers 0.

Now, to each run ρ_A = <F, I, H> of A, we associate the triple ρ_R = <F', I', H'>, where
the failure pattern F' is defined by

    F' : t ∈ T ↦ F'(t) = F(t) ∪ {p},

the mapping I' by:

1. if I(q) = s_q^0 for some process q ≠ p, then I'(p) = s_p^0; otherwise I'(p) = s_p^1;

2. for any process q ∈ Π \ {p}, I'(q) = I(q);

and the sequence H' is constructed from H by the following rules:

1. any event in H that is associated to a transition of R in which the oracle is not
consulted is left unchanged;

2. an event (β, q, t, R, m) in H, which is associated to some transition in R(q) of the form
(s_q, [q, m, 0], s'_q), is replaced in H' by the two-event sequence ((β, q, t, R, m), (a, q, t, Q, v)),
where v is the query value in s_q;

3. similarly, an event (β, q, t, S, m) in H which is associated to some transition in R(q)
of the form (s_q, [q, −, 0], s'_q), is replaced in H' by ((a, q, t, A, 0), (β, q, t, S, m)).
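
The passage from a run of A to the triple <F', I', H'> can be sketched as follows. This is a minimal illustration under several assumptions that are not in the text: events are plain 5-tuples, each event of H comes annotated with the query value v of its associated transition (or None when no oracle is consulted), initial states are abbreviated to the initial values 0 and 1, and all function names are hypothetical.

```python
def lift_failure_pattern(F, p):
    """F'(t) = F(t) ∪ {p}: process p is considered crashed from the start."""
    return lambda t: F(t) | {p}


def lift_initial(I, p):
    """I' agrees with I outside p; p starts with 0 iff some other process does."""
    I_prime = dict(I)
    I_prime[p] = 0 if 0 in I.values() else 1
    return I_prime


def lift_history(H, sanctuary="a"):
    """Rebuild H' from H, reinserting the hidden oracle events.

    H is a list of (event, v) pairs, where event = (component, q, t, kind, m)
    and v is the query value if the associated transition of R(q) consults
    the oracle (which then answers 0), or None otherwise.
    """
    H_prime = []
    for event, v in H:
        _, q, t, kind, _ = event
        if v is None:                      # rule 1: unchanged
            H_prime.append(event)
        elif kind == "R":                  # rule 2: receipt, then query
            H_prime += [event, (sanctuary, q, t, "Q", v)]
        elif kind == "S":                  # rule 3: answer 0, then send
            H_prime += [(sanctuary, q, t, "A", 0), event]
        else:
            raise ValueError(f"unexpected event type {kind!r}")
    return H_prime
```

Note that no event of the lifted history involves p, and that every oracle answer it contains is 0.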

We claim that ρ_R is a run of R. By construction, there is no event in H' whose process
name is p, and each event in H' at time t corresponds to at least one event in H that also
occurs at time t. Since H is compatible with F and F'(t) = F(t) ∪ {p}, it follows that H' is
compatible with F'. For any process q ∈ Π, H|q is well-formed, and so is H'|q. This proves
that H' satisfies R2.

From the R3, R4, and R6 conditions for H, it immediately follows that H' in turn
satisfies R3, R4, and R6.

Now since F(t) ⊆ F'(t), every process q which is correct in F' is also correct in F, and
so takes an infinite number of steps in H. By construction of H', it follows that q takes an
infinite number of steps in H'. Thus H' satisfies R5.

Finally, to show that ρ_R satisfies R1, we focus on a consultation of a in H'. By
construction, the only possible value answered by the oracle is 0. This trivially enforces agreement.
For validity of atomic commitment, since there is a faulty process in F', the answer 0 is
allowed for F' and any input vector V ∈ {0, 1}^n. Every step in H is complete (with a
receipt and a state change), and so by construction of H', the oracle answers each query
in H'. It follows that H'|a is a history of the oracle O.AC(Π, n − 1). This completes the
proof that ρ_R = <F', I', H'> is a run of R.
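
The validity argument above can be checked mechanically on small instances. The predicate below is only a sketch: it assumes the usual validity conditions of Atomic Commitment (the answer 1 requires that all votes are 1; the answer 0 requires a 0 vote or a failure), which may differ in presentation from the formal oracle specification given earlier in the paper. The name ac_answer_allowed is hypothetical.

```python
from itertools import product


def ac_answer_allowed(answer, votes, some_process_faulty):
    """Assumed validity conditions of an Atomic Commitment oracle."""
    if answer == 1:
        return all(v == 1 for v in votes)
    if answer == 0:
        return some_process_faulty or any(v == 0 for v in votes)
    return False


# Since p is faulty in F', the answer 0 is allowed for every input vector.
zero_always_allowed = all(
    ac_answer_allowed(0, votes, some_process_faulty=True)
    for votes in product((0, 1), repeat=3)
)
```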

Let ρ_A be any run of A with at most one failure; in the corresponding run ρ_R of R, at
most two processes fail. Since R is an algorithm for Cons(Π, 2), ρ_R satisfies the termination,
agreement, irrevocability and validity conditions of Consensus. It immediately follows that
the run ρ_A from which ρ_R stems also satisfies the termination, agreement, and irrevocability
conditions. Moreover, by definition of I', if all processes start with the same initial value v
in ρ_A, then they also have the same initial value v in ρ_R; the only possible decision value
in ρ_R, and so in ρ_A, is v.

Consequently, A is an algorithm for Cons(Π \ {p}, 1) using no oracle, a contradiction
with [16]. □

8.3 Resiliency degree 1

Theorem 8.3 establishes that Cons(n, f) is not C-reducible to AC(n, f) when f > 1. In this
section, we go over the remaining case f = 1: we show that if n > 2, then Cons(n, 1) is
C-reducible to AC(n, 1).

In Figure 3, we give an algorithm using an Atomic Commitment oracle which solves
Consensus in a system Π with n > 2 processes if at most one crash occurs. Our Consensus
algorithm uses the Atomic Commitment oracle only once and only to get some information
about failures. More precisely, with the help of this oracle, processes determine whether
some failure has occurred before each process sends its initial value for Consensus. If no
failure is detected, then every process waits until it receives the initial value from every
process, and Consensus is easily achieved in this case. Otherwise the oracle indicates an
eventual failure. The oracle O.AC(Π, 1) allows us to make accurate failure detection (that
is, no false detection), but the delicate point lies in the fact that the failure may occur at
any time, possibly in the future. We thus had to devise the second part of the algorithm
in order to deal with this lack of information about the time when the failure occurs. For
that, we have "de-randomized" Ben-Or's algorithm [3]: instead of tossing a coin at some
points of the computation, processes adopt the fixed value 0. In the resulting algorithm,
the occurrence of the failure forces correct processes to make a decision. The complete
code of the C-reduction is given in Figure 3.

Theorem 8.4 For any integer n > 2, Cons(n, 1) is C-reducible to AC(n, 1).

Proof: Let ρ = <F, I, H> denote a run of the algorithm in Figure 3. To prove that ρ
satisfies the termination, irrevocability, agreement, and validity conditions of Consensus, we
shall distinguish the case in which the oracle O.AC(Π, 1) answers 0 from the one in which
it answers 1 (d = 0 and d = 1).

Case d = 1. Irrevocability and validity are obvious. 

For termination and agreement, we claim that in this case, all processes query the oracle
O.AC(Π, 1) in ρ. Indeed, if some process p does not consult the oracle, then by property
O_AC (cf. Section ESJ), the oracle O.AC(Π, 1) must answer 0, a contradiction.

Since before consulting the oracle, every process has to broadcast its initial value, all 
processes that are still alive do receive the n initial values for Consensus. Termination and 
agreement conditions easily follow. 

Case d = 0. First, we claim that ρ satisfies the validity condition of Consensus. Indeed,
suppose that all processes start with the same initial value v. Every process sends (R, v, 1)
to all; since n > 2, every process proposes value v at the first round, i.e., sends (P, v, 1) to
all. As n > 2, it follows from the code that each process then decides v.

For agreement, we argue as for Ben-Or's algorithm. First, because of the majority rule
which determines when a process proposes value v ∈ {0, 1} (i.e., sends (P, v, r) to all), it is
impossible for a process to propose 0 and for another one to propose 1 in the same round.
Suppose that some process makes a decision in ρ, and let r denote the first round at which a
process decides. If process p decides v at round r, then it has received at least 2 propositions
for v at round r. Thus, every process q receives at least one proposition for v at round r,
and so we have x_q = v at the end of round r. This forces every process to decide v at the
latest at round r + 1, and to keep deciding v in all subsequent rounds. In other words, ρ
satisfies agreement and irrevocability.

We now argue termination. Since every query value of O.AC(Π, 1) is 1 and the oracle
answers 0, exactly one failure occurs in run ρ. For every process p, we consider the round r_p
that process p is executing when this failure occurs, and we let r_ρ = max_{p ∈ Correct(F)} (r_p) + 1.
Suppose no process has made a decision by the end of round r_ρ. All correct processes receive the
same set of n − 1 messages of the form (R, −, r_ρ), and so they propose the same value
v ∈ {0, 1, ?} at round r_ρ. If v ≠ ?, then every correct process decides v since it receives
n − 1 ≥ 2 propositions for v. Otherwise, v = ? and every correct process p sets x_p to 0.
Since n > 2, it is easy to see that in this case, correct processes decide at round r_ρ + 1.
This completes the proof of termination. □
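
The three arguments above (validity, agreement, termination) can be exercised on small systems with a toy simulation of the case d = 0. The sketch below is not the formal model: it assumes lockstep rounds, a single crash that interrupts the R-broadcast of a chosen round, and an adversary that merely picks at random which n − 1 messages each process receives. The oracle consultation itself is not modeled, and the names collect, simulate and check are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def collect(msgs, receiver, n, rng):
    """Return n - 1 of the values addressed to `receiver` (one may be late)."""
    available = [value for dests, value in msgs if receiver in dests]
    rng.shuffle(available)
    return available[: n - 1]


def simulate(n, inputs, crash_pid, crash_round, seed, rounds=10):
    """Run the d = 0 branch; return {correct pid: decisions, one per round}."""
    rng = random.Random(seed)
    everyone = set(range(n))
    correct = everyone - {crash_pid}
    x = list(inputs)
    decisions = {p: [] for p in correct}
    for r in range(1, rounds + 1):
        alive = everyone if r < crash_round else correct
        # Phase R: every live process broadcasts its current estimate.
        msgs = [(everyone, x[p]) for p in sorted(alive)]
        if r == crash_round:
            # Assumption: the crash cuts this broadcast short.
            partial = {q for q in correct if rng.random() < 0.5}
            msgs.append((partial, x[crash_pid]))
        proposal = {}
        for p in alive:
            value, count = Counter(collect(msgs, p, n, rng)).most_common(1)[0]
            proposal[p] = value if count > n / 2 else "?"
        # Phase P: every live process broadcasts its proposal.
        msgs = [(everyone, proposal[p]) for p in sorted(alive)]
        for p in alive:
            counts = Counter(w for w in collect(msgs, p, n, rng) if w != "?")
            if counts:
                w, count = counts.most_common(1)[0]
                x[p] = w
                if count >= 2 and p in correct:
                    decisions[p].append(w)
            else:
                x[p] = 0  # fixed value replacing Ben-Or's coin toss
    return decisions


def check(n, crash_rounds=(1, 2, 3), seeds=range(3)):
    """Assert the Consensus conditions on all small runs of the sketch."""
    for inputs in product((0, 1), repeat=n):
        for crash_pid in range(n):
            for crash_round in crash_rounds:
                for seed in seeds:
                    dec = simulate(n, inputs, crash_pid, crash_round, seed)
                    values = {v for d in dec.values() for v in d}
                    assert all(dec.values())      # termination
                    assert len(values) == 1       # agreement, irrevocability
                    assert values <= set(inputs)  # validity
    return True
```

Calling check(3) or check(4) explores every binary input vector, every choice of the crashed process, and a few crash rounds and schedules. Such a run is of course no substitute for the proof, since it samples only a restricted family of executions.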

Note that the reduction above is much stronger than the one given by Theorem 6.4 for
the particular case f = 1. Firstly, Theorem 8.4 establishes that AC(n, 1) is harder to solve
than Cons(n, 1), whereas Theorem 6.4 just compares AC(n, 1) with Cons(n + 1, 1), which is
shown to be a weaker task than Cons(n, 1) by Proposition 7.4. Secondly, Theorem 8.4 is a
C-reduction result, and not only a C*-reduction result as Theorem 6.4 is.

We end this section by deriving an interesting corollary from Theorems 5.4 and 8.4.
On the one hand, Theorem 5.4 shows that Cons(n, 1) is not K-reducible to AC(n, 1). On the
other hand, Theorem 8.4 establishes that Cons(n, 1) C-reduces to AC(n, 1). Hence, there
are two unsolvable tasks for which the reductions ≤_K and ≤_C differ. In other words, we
have proved that in distributed computing, K-reduction is strictly stronger than C-reduction
(compare with j^Uj, where the relations between various polynomial-time reducibilities in 
classical complexity theory are examined). 

8.4 Extracting Consensus from Atomic Commitment and vice-versa 

Theorem 8.3 shows that an oracle for AC(n, f) does not help to solve Cons(n, f), and more
generally to solve any Consensus task for {1, ..., n}. However, Theorem 6.4 partially gets
around this impossibility result by enlarging the set of processes, and so by weakening the
Consensus task to be solved (cf. Proposition 7.4). Indeed, this theorem asserts that if we
grant the processes in {1, ..., n + f} the ability to query oracles of type O.AC(n, f), then
f-resilient Consensus is a solvable task (see footnote 8). In other words, contrary to Cons(n, f),
the task Cons(n + f, f) can be extracted from AC(n, f).

Conversely, we may wonder whether enlarging the set of processes {1, ..., n} could
make Atomic Commitment tasks solvable if we allow the processes to consult oracles of
type O.Cons(n, f). In fact, as an application of our previous results, we may prove that no
f-resilient Atomic Commitment task can be extracted from O.Cons(n, f):

Proposition 8.5 For any integers n, m, and f such that 1 ≤ f ≤ n − 1 and n ≤ m,
AC(m, f) is not C*-reducible to Cons(n, f).

Proof: For the sake of contradiction, assume that AC(m, f) ≤_C* Cons(n, f), that is

    AC(m, f) ≤_C {Cons(Π, f) : Π ⊆ {1, ..., m} and |Π| = n}.

By Proposition 7.5 applied m − n times, each Cons(Π, f) task C-reduces to Cons(m, f +
m − n). Using strong transitivity of the C-reduction, we get that AC(m, f) ≤_C Cons(m, f +
m − n). As n ≤ m, we trivially have Cons(m, f + m − n) ≤_C Cons(m, f), and so it follows
that AC(m, f) ≤_C Cons(m, f), which contradicts Theorem 8.1. □

8 Thanks to the introduction of the k-Threshold Agreement tasks, we shall prove a better result in Part II,
namely that Cons(n + f − 1, f) C*-reduces to AC(n, f).



Roughly speaking, this proposition states that Consensus contains no Atomic Commitment
component. Together with the reducibility result in Theorem 6.4 alluded to above, this
corroborates the popular belief that Consensus is easier to solve than Atomic Commitment.

Variables of process p:

    x_p ∈ V, initially v_p
    r_p ∈ N, initially 1

Algorithm for process p:

    Send(v_p) to all
    Query(O.AC(Π, 1))(1)
    Answer(O.AC(Π, 1))(d)
    if d = 1
    then
        wait until [Receive(v_q) from all q ∈ Π]
        x_p := min_{q ∈ Π}(v_q)
        Decide(x_p)
    else
        repeat
            Send((R, x_p, r_p)) to all
            wait until [Receive((R, *, r_p)) from n − 1 processes]   (where * can be 0 or 1)
            if more than n/2 messages have the same value v ∈ {0, 1} in the second component
            then
                Send((P, v, r_p)) to all
            else
                Send((P, ?, r_p)) to all
            wait until [Receive((P, *, r_p)) from n − 1 processes]   (where * can be 0, 1, or ?)
            if at least two of the (P, *, r_p)'s received have the same w ∈ {0, 1} in the second component
            then
                x_p := w
                Decide(w)
            else
                if one of the (P, *, r_p)'s received has w ∈ {0, 1} in the second component
                then
                    x_p := w
                else
                    x_p := 0
            r_p := r_p + 1

Figure 3: A C-reduction from Cons(n, 1) to AC(n, 1)